Description

TECHNICAL FIELD 

[0001] The present invention relates to a novel β1,3-N-acetyl-D-galactosaminyltransferase protein and a nucleic acid
encoding the same, as well as a canceration assay using the same, etc.

BACKGROUND ART 

[0002] Recent attention has been focused on the in vivo roles of sugar chains and/or complex carbohydrates. For
example, factors for determining blood types are glycoproteins, and it is glycolipids that are involved in the functions 
of the nervous system. Thus, enzymes having the ability to synthesize sugar chains constitute an extremely important 
key to analyzing physiological activities provided by various sugar chains. 

[0003] For example, N-acetyl-D-galactosamine (hereinafter also referred to as "GalNAc") is among the components
constituting glycosaminoglycans, as well as being a sugar residue found in various sugar chain structures such as
glycosphingolipids and mucin-type sugar chains. Thus, an enzyme transferring GalNAc will serve as an extremely 
important tool in analyzing the roles of sugar chains in various tissues in vivo. 

[0004] As described above, attention has been focused on the in vivo roles of sugar chains, but it cannot be said
that sufficient headway has been made in analyzing in vivo sugar chain synthesis. This is in part because the mechanism
of sugar chain synthesis and the in vivo localization of sugar synthesis have not been fully analyzed. In analyzing the
mechanism of sugar chain synthesis, it is necessary to analyze glycosylation enzymes (particularly glycosyltransferases)
and to analyze what kind of sugar chains are synthesized by means of the enzymes. To this end, there is a strong
demand for identifying novel glycosyltransferases and analyzing their functions.

[0005] There are some reports of glycosyltransferases having the ability to transfer GalNAc (Non-patent Documents
1 to 4). For example, among human GalNAc transferases, enzymes transferring GalNAc with "β1,4 linkage" are known
(Non-patent Document 1) and enzymes using "galactose" as their acceptor substrate are known as enzymes transferring
GalNAc with β1,3 linkage (Non-patent Document 2) ("β1,3" or "β3" as used herein refers to a glycosidic linkage
between an α-hydroxyl group at the 1-position of a sugar residue in an acceptor substrate and a hydroxyl group at the
3-position of a sugar residue to be transferred and linked thereto).

[0006] On the other hand, in higher organisms like humans, no enzyme is known to transfer GalNAc with "β1,3
linkage" to "N-acetylglucosamine" (hereinafter also referred to as "GlcNAc").

[0007] Although there is a report showing that the sugar chain structure in which GalNAc and GlcNAc are linked in
a β1,3 fashion was confirmed in sugar chains on neutral glycolipids of fly, a kind of arthropod (Non-patent Document
5), it has been believed that such a sugar chain structure is not present in mammals, particularly in humans, to begin
with.

Patent Document 1

International Patent Publication No. WO 01/79556

Non-patent Document 1

Cancer Res. 1993 Nov 15; 53(22):5395-400: Yamashiro S, Ruan S, Furukawa K, Tai T, Lloyd KO, Shiku H, Fujikawa K.
Genetic and enzymatic basis for the differential expression of GM2 and GD2 gangliosides in human cancer
cell lines.

Non-patent Document 2

Biochim Biophys Acta. 1995 Jan 3; 1254(1):56-65: Taga S, Tetaud C, Mangeney M, Tursz T, Wiels J. Sequential
changes in glycolipid expression during human B cell differentiation: enzymatic bases.

Non-patent Document 3

Proc Natl Acad Sci USA. 1996 Oct 1; 93(20):10697-702: Haslam DB, Baenziger JU.
Expression cloning of Forssman glycolipid synthetase: a novel member of the histo-blood group ABO gene family.

Non-patent Document 4

J Biol Chem. 1997 Sep 19; 272(38):23503-14: Wandall HH, Hassan H, Mirgorodskaya E, Kristensen AK, Roepstorff
P, Bennett EP, Nielsen PA, Hollingsworth MA, Burchell J, Taylor-Papadimitriou J, Clausen H. Substrate specificities
of three members of the human UDP-N-acetyl-alpha-D-galactosamine:Polypeptide N-acetylgalactosaminyltransferase
family, GalNAc-T1, -T2, and -T3.

Non-patent Document 5

J. Biochem. (Tokyo) 1990 June; 107(6):899-903: Sugita M, Inagaki F, Naito H, Hori T. Studies on glycosphingolipids
in larvae of the green-bottle fly, Lucilia caesar: two neutral glycosphingolipids having large straight
oligosaccharide chains with eight and nine sugars.

DISCLOSURE OF THE INVENTION

[0008] A problem to be solved by the present invention is to provide a polypeptide which is a mammal-derived
(particularly human-derived) glycosyltransferase and which has a novel transferase activity to transfer GalNAc with β1,3
linkage to GlcNAc, as well as a nucleic acid encoding such a polypeptide, etc.

[0009] Another problem to be solved by the present invention is to provide a transformant expressing the nucleic 
acid in host cells, a method for producing the encoded protein by allowing the transformant to produce the protein and 
then collecting the protein, and an antibody recognizing the protein. 

[0010] On the other hand, since sugar chain synthesis may be affected by canceration, the identification and expression
analysis of such a glycosylation enzyme can be expected to provide an index useful for cancer diagnosis, etc.
The present invention also provides detailed procedures and criteria useful for canceration assay or the like by analyzing
and comparing, at the tissue or cell line level, the transcription level of such a protein which varies in correlation with
canceration or malignancy.

BRIEF DESCRIPTION OF DRAWINGS

[0011] 

Figure 1 is a diagram showing changes in the activity of the G34 enzyme protein according to this example, plotted
against the reaction time.

Figure 2A shows the results of NMR measurement, used for analysis of the sugar chain structure synthesized by
the G34 enzyme protein according to this example.

Figure 2B shows a partial magnified view of the NMR results in Figure 2A.

Figure 3 is a table summarizing NOE in NMR shown in Figure 2. Various conditions for the data in Table 1 are as
follows: 1.08 mM, 298 K, D2O, CH2(high) = 4.557 ppm for non-marked data, chemical shifts for data marked with
* are CH2(low) = 4.778 ppm, phenyl(ortho) = 7.265 ppm, phenyl(meta) = 7.354 ppm and phenyl(para) = 7.320
ppm, calculated from the 1D spectrum.

Figure 4 is a table summarizing relevant data (tentative NOE) for each pyranose with respect to NMR shown in
Figure 2 (s: strong, m: medium, w: weak, vw: very weak, A: GlcNAc, B: GalNAc).

Figure 5 shows a comparison of amino acid sequences between G34 enzyme protein according to this example
and known β3Gal transferases.

Figure 6 shows a comparison of motifs involved in the β3-linking activity between G34 enzyme protein according
to this example and various known β3-linking glycosyltransferases. "b3" represents a β1-3 linkage and "Gn"
represents GlcNAc.

Figure 7 is a diagram showing the pH dependence of the activity of the G34 enzyme protein according to this
example.

Figure 8 is a diagram showing ion requirement for the activity of the G34 enzyme protein according to this example.

Figure 9 presents graphs showing the expression levels of the G34 enzyme protein according to this example in
human cell lines.

Figure 10 shows amino acid sequence alignment between mouse G34 according to this example (upper) and
human G34 (lower).

Figure 11 shows the result of in situ hybridization performed on a mouse testis sample using the mG34 nucleic
acid according to this example.

DETAILED DESCRIPTION OF THE INVENTION

[0012] To solve the problems stated above, the inventors of the present invention have attempted to isolate and
purify a nucleic acid of interest, which may have high sequence identity, on the basis of the nucleotide sequence of an
enzyme gene functionally similar to the intended enzyme. More specifically, first, the sequence of a known
glycosyltransferase, β3 galactosyltransferase 6 (β3GalT6), was used as a query for a BLAST search to thereby find a sequence
with homology (GenBank No. AX285201). It should be noted that this nucleotide sequence was known as the sequence
of SEQ ID NO: 1006 disclosed in International Publication No. WO 01/79556 (Patent Document 1 listed above), but
its activity remained unknown.

[0013] First, the inventors of the present invention have independently cloned the above gene by PCR, have determined
its nucleotide sequence (SEQ ID NO: 1) and putative amino acid sequence (SEQ ID NO: 2), and have succeeded
in identifying a certain biological activity of a polypeptide encoded by the nucleic acid, thus completing the present
invention. Moreover, when using the sequence as a query to search mouse genes, the inventors have found the
nucleotide sequence of SEQ ID NO: 3 and its putative amino acid sequence (SEQ ID NO: 4).

[0014] The gene having the nucleotide sequence of SEQ ID NO: 1 and the protein having the amino acid sequence
of SEQ ID NO: 2 were designated human G34, while the gene having the nucleotide sequence of SEQ ID NO: 3 and
the protein having the amino acid sequence of SEQ ID NO: 4 were designated mouse G34.

[0015] According to the studies of the inventors, the above G34 protein uses an N-acetyl-D-galactosamine residue
as a donor substrate and an N-acetyl-D-glucosamine residue as an acceptor substrate. As detailed later in Example
2, the G34 protein was found to retain three motifs in its amino acid sequence, which are well conserved in the enzyme
family transferring various sugars (e.g., galactose, N-acetyl-D-glucosamine) in the linking mode of β1,3. In light of these
points, the G34 protein was unexpectedly believed to have transferase activity to synthesize a novel sugar chain
structure "GalNAc-β1,3-GlcNAc," for which no report has been made for mammals, particularly humans. The linking mode
was actually confirmed by NMR.

[0016] Namely, the present invention relates to a β1,3-N-acetyl-D-galactosaminyltransferase protein which transfers
N-acetyl-D-galactosamine to N-acetyl-D-glucosamine with β1,3 linkage.

[0017] An enzyme protein according to a preferred embodiment of the present invention may have at least one or 
any combination of the following properties (a) to (c). 

(a) Acceptor substrate specificity 

[0018] When using an oligosaccharide as an acceptor substrate, the enzyme protein shows transferase activity toward
Bz-β-GlcNAc, GlcNAc-β1-4-GlcNAc-β-Bz, Gal-β1-3(GlcNAc-β1-6)GalNAc-α-pNp, GlcNAc-β1-3GalNAc-α-pNp
and GlcNAc-β1-6GalNAc-α-pNp ("GlcNAc" represents an N-acetyl-D-glucosamine residue, "GalNAc" represents an
N-acetyl-D-galactosamine residue, "Bz" represents a benzyl group, "pNp" represents a p-nitrophenyl group, and "-"
represents a glycosidic linkage. Numbers in these formulae each represent the carbon number in the sugar ring where
a glycosidic linkage is present, and "α" and "β" represent anomers of the glycosidic linkage at the 1-position of the
sugar ring. An anomer whose positional relationship with CH2OH or CH3 at the 5-position is trans and cis is represented
by "α" and "β", respectively).

[0019] Preferably, the enzyme protein is substantially free from transferase activity toward Bz-α-GlcNAc and
Gal-β1-3GlcNAc-β-pNp.

(b) Reaction pH 

[0020] The activity is lower in a pH range of 6.2 to 6.6 than in other pH ranges. 

(c) Divalent ion requirement 

[0021] Although the above activity is enhanced at least in the presence of Mn2+, Co2+ or Mg2+, the Mn2+-induced
enhancement of the activity is almost completely eliminated in the presence of Cu2+.

[0022] Moreover, in a preferred embodiment of the above glycosyltransferase protein, the glycosyltransferase protein
of the present invention comprises the following polypeptide (A) or (B):

(A) a polypeptide which has the amino acid sequence shown in SEQ ID NO: 2 or 4; or 

(B) a polypeptide which has an amino acid sequence with substitution, deletion or insertion of one or more amino
acids in the amino acid sequence shown in SEQ ID NO: 2 or 4 and which transfers N-acetyl-D-galactosamine to
N-acetyl-D-glucosamine with β1,3 linkage.

[0023] Moreover, in a more preferred embodiment of the above glycosyltransferase protein, the above polypeptide 
(A) is a glycosyltransferase protein consisting of a polypeptide having an amino acid sequence covering amino acids 
189 to 500 shown in SEQ ID NO: 2. Likewise, in an even more preferred embodiment of the above glycosyltransferase 
protein, the above polypeptide (A) is a glycosyltransferase protein consisting of a polypeptide having an amino acid 
sequence covering amino acids 36 to 500 shown in SEQ ID NO: 2. 

[0024] In addition, other embodiments of the glycosyltransferase protein of the present invention encompass proteins
consisting of polypeptides having amino acid sequences sharing more than 30% identity, preferably at least
40% identity, and more preferably at least 50% identity with an amino acid sequence covering amino acids 189 to 500
shown in SEQ ID NO: 2 or amino acids 35 to 504 shown in SEQ ID NO: 4.

[0025] In another aspect, the present invention provides a nucleic acid consisting of a nucleotide sequence encoding 
any one of the above polypeptides or a nucleotide sequence complementary thereto. 

[0026] In a preferred embodiment, the nucleic acid encoding the protein of the present invention is a nucleic acid
consisting of the nucleotide sequence shown in SEQ ID NO: 1 or 3 or a nucleotide sequence complementary to at
least one of them. More preferably, in the case of human origin, such a nucleic acid consists of a nucleotide sequence
covering nucleotides 565 to 1503 shown in SEQ ID NO: 1 or a nucleotide sequence complementary thereto, and most
preferably consists of a nucleotide sequence covering nucleotides 106 to 1503 shown in SEQ ID NO: 1 or a nucleotide
sequence complementary thereto. In the case of mouse origin, such a nucleic acid consists of a nucleotide sequence
covering nucleotides 103 to 1512 shown in SEQ ID NO: 3 or a nucleotide sequence complementary thereto.
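
The nucleotide ranges above can be related to the amino acid ranges given for SEQ ID NOs: 2 and 4 by simple codon arithmetic. The following is a minimal sketch only; it assumes (this is not stated in the text) that the reading frame begins at nucleotide 1 of each sequence, and the function names are hypothetical. Under that assumption, nucleotides 565 and 106 of SEQ ID NO: 1 fall in codons 189 and 36, and nucleotide 103 of SEQ ID NO: 3 falls in codon 35, in agreement with the amino acid ranges recited earlier; nucleotide 1503 falls in codon 501, one past amino acid 500, which would be consistent with the range including a termination codon.

```python
def codon_index(nt_pos: int, frame_start: int = 1) -> int:
    """Return the 1-based codon number that contains nucleotide nt_pos.

    frame_start is the first nucleotide of the reading frame; the default
    of 1 is an assumption made for illustration.
    """
    if nt_pos < frame_start:
        raise ValueError("position lies before the reading frame")
    return (nt_pos - frame_start) // 3 + 1


def codon_span(first_nt: int, last_nt: int, frame_start: int = 1) -> tuple[int, int]:
    """Map an inclusive nucleotide range to its inclusive codon range."""
    if (last_nt - first_nt + 1) % 3 != 0:
        raise ValueError("range is not a whole number of codons")
    return codon_index(first_nt, frame_start), codon_index(last_nt, frame_start)


if __name__ == "__main__":
    for label, first, last in [
        ("SEQ ID NO: 1, nt 565-1503", 565, 1503),
        ("SEQ ID NO: 1, nt 106-1503", 106, 1503),
        ("SEQ ID NO: 3, nt 103-1512", 103, 1512),
    ]:
        print(label, "-> codons", codon_span(first, last))
```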

[0027] Embodiments of the above nucleic acids according to the present invention encompass DNA.

[0028] The present invention further provides a vector carrying any one of the above nucleic acids and a transformant 
containing the vector. 

[0029] In yet another aspect, the present invention provides a method for producing a β1,3-N-acetyl-D-galactosaminyltransferase
protein, which comprises growing the above transformant to express the above glycosyltransferase
protein and collecting the glycosyltransferase protein from the grown transformant.

[0030] In yet another aspect, the present invention provides an antibody recognizing any one of the above
β1,3-N-acetyl-D-galactosaminyltransferase proteins.

[0031] On the other hand, in response to the discovery of the above G34, the inventors of the present invention have
clarified that the expression level of G34 mRNA is increased significantly in cancerous tissues and cell lines.

[0032] Thus, the present invention also provides a nucleic acid for measurement, which is useful as an index of
canceration or malignancy and which hybridizes under stringent conditions to the nucleotide sequence shown in SEQ
ID NO: 1 or 3 or a nucleotide sequence complementary to at least one of them.

[0033] The nucleic acid for measurement of the present invention may typically consist of a nucleotide sequence
covering at least a dozen contiguous nucleotides in the nucleotide sequence shown in SEQ ID NO: 1 or 3 or a nucleotide
sequence complementary thereto.

[0034] In a preferred embodiment, the nucleic acid for measurement of the present invention encompasses a probe 
consisting of the nucleotide sequence shown in SEQ ID NO: 16 or a nucleotide sequence complementary thereto, as 
well as a primer set consisting of the following nucleotide sequences (1) or (2): 

(1) a pair of the nucleotide sequences shown in SEQ ID NOs: 14 and 15; or

(2) a pair of the nucleotide sequences shown in SEQ ID NOs: 17 and 18.

[0035] Also, the nucleic acid for measurement of the present invention may be used as a tumor marker.

[0036] The present invention further provides a method for assaying canceration in a biological sample, which comprises:

(a) using any one of the above nucleic acids to measure the transcription level of the nucleic acid in the biological 
sample; and 

(b) determining whether the measured value is significantly higher than that of a normal biological sample. 
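
Steps (a) and (b) can be illustrated with a small numerical sketch. The text does not specify a statistical test or a cut-off, so everything below is an assumption made for illustration: the function names, the use of a fold change together with Welch's t statistic, the thresholds `min_fold` and `min_t`, and the replicate values themselves, which are invented and are not measurements from this document.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample: list[float], reference: list[float]) -> float:
    """Welch's t statistic for two sets of replicate measurements.

    Each list needs at least two replicates and the combined variance must
    be non-zero.
    """
    se2 = variance(sample) / len(sample) + variance(reference) / len(reference)
    if se2 == 0:
        raise ValueError("replicates show no variance; t is undefined")
    return (mean(sample) - mean(reference)) / sqrt(se2)


def is_elevated(sample: list[float], reference: list[float],
                min_fold: float = 2.0, min_t: float = 3.0) -> bool:
    """Hypothetical decision rule: both fold change and t must exceed a cut-off."""
    fold = mean(sample) / mean(reference)
    return fold >= min_fold and welch_t(sample, reference) >= min_t


if __name__ == "__main__":
    # Invented relative transcript levels, for illustration only.
    test_sample = [8.1, 7.6, 9.0]
    normal_sample = [1.0, 1.2, 0.9]
    print("elevated:", is_elevated(test_sample, normal_sample))
```

In practice the choice of test, normalisation and threshold would depend on the measurement method (hybridization or PCR) and on the number of replicates available.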

[0037] In a preferred embodiment, the canceration assay of the present invention includes cases where the meas- 
urement of the transcription level is made by hybridization or PCR targeted at the above biological sample and using 
any one of the above nucleic acids. 

[0038] In a further aspect of the canceration assay of the present invention, the present invention provides a method
for assaying the effectiveness of treatment in cancer therapy, which comprises using any one of the above nucleic
acids to measure the transcription level of the nucleic acid in a biological sample treated by cancer therapy, and
determining whether the measured value is significantly lower than that obtained before treatment or than that of an
untreated sample.

[0039] In particular, the above biological sample may be derived from the large intestine (colon) or lung. 

MODE FOR CARRYING OUT THE INVENTION 

[0040] The mode for carrying out the present invention will be described in detail below. 

(1) Nucleic acid encoding the G34 enzyme protein of the present invention

[0041] Based upon the above discovery, the inventors of the present invention expressed the G34 enzyme protein
encoded by the nucleic acid, isolated and purified the protein, and further identified its enzymatic activity. Given
that an amino acid sequence having the desired enzymatic activity was identified, the nucleotide sequence
of SEQ ID NO: 1 or 3 is one embodiment of a nucleic acid encoding the isolated polypeptide having the enzymatic
activity. This means that the nucleic acid of the present invention encompasses all nucleic acids, finite in number,
having degenerate nucleotide sequences capable of encoding the same amino acid sequence for the G34 enzyme
protein.
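
The statement that the degenerate sequences are finite in number can be made concrete: under the standard genetic code, the number of distinct coding sequences for a given amino acid sequence is the product of the number of synonymous codons for each residue. The sketch below assumes the standard code and uses a short made-up peptide, not a fragment of SEQ ID NO: 2 or 4; the function name is hypothetical.

```python
from math import prod

# Number of synonymous codons per amino acid in the standard genetic code
# (one-letter amino acid codes; the 61 sense codons in total).
CODON_DEGENERACY = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_coding_sequences(peptide: str) -> int:
    """Count the distinct nucleotide sequences that encode the peptide."""
    try:
        return prod(CODON_DEGENERACY[aa] for aa in peptide.upper())
    except KeyError as exc:
        raise ValueError(f"unknown amino acid code: {exc.args[0]}") from None


if __name__ == "__main__":
    # Made-up tripeptide: Met (1 codon) x Leu (6) x Ser (6).
    print(count_coding_sequences("MLS"))
```

For a protein of several hundred residues the count is astronomically large but still finite, which is the sense of the passage above.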
[0042] The present invention also provides a nucleic acid encoding the full-length or a fragment of a polypeptide
consisting of a novel amino acid sequence as mentioned above. A typical nucleic acid encoding such a novel polypep- 
tide may have the nucleotide sequence shown in SEQ ID NO: 1 or 3 or a nucleotide sequence complementary to at 
least one of them. 

[0043] The nucleic acid of the present invention also encompasses both single-stranded and double-stranded DNA
and their complementary RNA. Examples of DNA include naturally-occurring DNA, recombinant DNA, chemically- 
bound DNA, PCR-amplified DNA, and combinations thereof. However, DNA is preferred in terms of stability during 
vector and/or transformant preparation. 

[0044] The nucleic acid of the present invention may be prepared in the following manner, by way of example. 

[0045] First, the known sequence under GenBank No. AX285201 or a part thereof may be used to perform nucleic
acid amplification on a cDNA library in a routine manner using basic procedures for genetic engineering (e.g.,
hybridization, nucleic acid amplification), thereby cloning the nucleic acid of the present invention. Since the nucleic acid may
be obtained, e.g., as a DNA fragment of approximately 1.5 kbp as a PCR product, the fragment may be separated
using techniques for screening DNA fragments based on their molecular weight (e.g., agarose gel electrophoresis)
and isolated in a routine manner, e.g., using techniques for excising a specific band.

[0046] Moreover, according to the putative amino acid sequence (SEQ ID NO: 2 or 4) of the isolated nucleic acid, 
the nucleic acid may be estimated to have a hydrophobic transmembrane region at its N-terminal end. By preparing a 
region of a nucleotide sequence encoding a polypeptide free from this transmembrane region, it is also possible to 
obtain the nucleic acid of the present invention that encodes a soluble form of the polypeptide. 

[0047] Based on the nucleotide sequence of the nucleic acid disclosed herein, it is easy for those skilled in the art
to create appropriate primers from nucleotide sequences located at both ends of a nucleic acid of interest or a region 
thereof to be prepared and to use the primers thus created for nucleic acid amplification to amplify and prepare the 
region of interest. 
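
As an illustration of taking primers from both ends of a region of interest, the sketch below returns a forward primer copied from the 5' end of the region and a reverse primer that is the reverse complement of its 3' end. It is a simplified assumption-laden example: the function names are hypothetical, the template is a made-up toy sequence rather than SEQ ID NO: 1 or 3, and real primer design would also weigh melting temperature, GC content and secondary structure, none of which is modelled here.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def end_primers(template: str, first: int, last: int, length: int = 20) -> tuple[str, str]:
    """Return (forward, reverse) primers flanking template[first..last].

    Positions are 1-based and inclusive, as in the nucleotide ranges used
    in the text. Both primers are written 5'->3'.
    """
    if not 1 <= first <= last <= len(template):
        raise ValueError("region lies outside the template")
    region = template[first - 1:last]
    if len(region) < length:
        raise ValueError("region is shorter than the requested primer length")
    return region[:length], reverse_complement(region[-length:])


if __name__ == "__main__":
    toy_template = "ATGGCCAAGTTTCCCGGGTAA"  # made-up sequence
    forward, reverse = end_primers(toy_template, 1, 21, length=6)
    print(forward, reverse)
```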

[0048] The above nucleic acid amplification includes, for example, reactions requiring thermal cycling such as
polymerase chain reaction (PCR) [Saiki R. K., et al., Science, 230, 1350-1354 (1985)], ligase chain reaction (LCR) [Wu
D. Y., et al., Genomics, 4, 560-569 (1989); Barringer K. J., et al., Gene, 89, 117-122 (1990); Barany F., Proc. Natl.
Acad. Sci. USA, 88, 189-193 (1991)] and transcription-based amplification [Kwoh D. Y., et al., Proc. Natl. Acad. Sci.
USA, 86, 1173-1177 (1989)], as well as isothermal reactions such as strand displacement amplification (SDA) [Walker
G. T., et al., Proc. Natl. Acad. Sci. USA, 89, 392-396 (1992); Walker G. T., et al., Nuc. Acids Res., 20, 1691-1696
(1992)], self-sustained sequence replication (3SR) [Guatelli J. C., Proc. Natl. Acad. Sci. USA, 87, 1874-1878 (1990)]
and Qβ replicase system [Lizardi et al., BioTechnology 6, p. 1197-1202 (1988)]. It is also possible to use other reactions,
e.g., nucleic acid sequence-based amplification (NASBA) through competitive amplification between a target nucleic
acid and a mutated sequence, found in European Patent No. 0525882. Preferred is PCR.

[0049] The use of the nucleic acid of the present invention also enables the expression of the intended enzyme
protein or the provision of probes and antisense primers for the purpose of medical research or gene therapy, as
described later.

[0050] Those skilled in the art will be able to obtain a nucleic acid as useful as the sequence of SEQ ID NO: 1 or 3
by preparing a nucleic acid consisting of a nucleotide sequence sharing a certain homology with the nucleotide
sequence of SEQ ID NO: 1 or 3. For example, the homologous nucleic acid of the present invention encompasses nucleic
acids encoding proteins which share homology with the amino acid sequence shown in SEQ ID NO: 2 or 4 and which
have the ability to transfer N-acetyl-D-galactosamine to N-acetyl-D-glucosamine with β1,3 linkage.

[0051] To identify the range of nucleic acids encoding such homologous proteins according to the present invention,
an identity search was performed for the nucleic acid sequence shown in SEQ ID NO: 1 or 3 of the present invention.
The search indicated that the nucleic acid sequence shares 40% identity with the nucleic acid sequence of a known β1,4GalNAc
transferase showing the highest homology (Non-patent Document 1 listed above) and also shares 40% identity with
the nucleic acid sequence of a known β1,3Gal transferase showing the highest homology (Non-patent Document 2
listed above). In light of these points, a preferred nucleic acid sequence encoding the homologous protein of the present
invention typically shares more than 40% identity, more preferably at least 50% identity, and particularly preferably at
least 60% identity with any one of the entire nucleotide sequence of SEQ ID NO: 1 or 3, preferably a partial nucleotide
sequence consisting of nucleotides 106 to 1503 in SEQ ID NO: 1, preferably a partial nucleotide sequence consisting
of nucleotides 103 to 1512 in SEQ ID NO: 3, or nucleotide sequences complementary to these sequences.

[0052] Likewise, the nucleotide sequences shown in SEQ ID NOs: 1 and 3 share 86% identity with each other. In
light of this point, a preferred nucleic acid sequence encoding the homologous protein of the present invention can be
defined as sharing at least 86%, preferably 90% identity with any one of the entire nucleotide sequence of SEQ ID NO:
1, preferably nucleotides 106 to 1503, or a nucleotide sequence complementary thereto.

[0053] The above percentage of identity may be determined by visual inspection and mathematical calculation.
Alternatively, the percentage of identity between two nucleic acid sequences may be determined by comparing sequence
information using the GAP computer program, version 6.0, described by Devereux et al., Nucl. Acids Res. 12: 387,
1984 and available from the University of Wisconsin Genetics Computer Group (UWGCG). The preferred default
parameters for the GAP program include: (1) a unary comparison matrix (containing a value of 1 for identities and 0 for
non-identities) for nucleotides, and the weighted comparison matrix of Gribskov and Burgess, Nucl. Acids Res. 14:
6745, 1986, as described by Schwartz and Dayhoff, eds., Atlas of Protein Sequence and Structure, pp. 353-358,
National Biomedical Research Foundation, 1979; (2) a penalty of 3.0 for each gap and an additional 0.10 penalty for
each symbol in each gap; and (3) no penalty for end gaps. It is also possible to use other sequence comparison
programs used by those skilled in the art.
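
The scoring parameters listed above can be illustrated on an alignment that has already been made. The sketch below is not the GAP program and does not search for the optimal alignment; it only scores a given pair of gapped nucleotide sequences with the unary matrix (1 for identities, 0 otherwise), a penalty of 3.0 per gap plus 0.10 per gap symbol, and no penalty for end gaps. The function names are hypothetical, and the choice to compute percent identity over columns where both sequences have a residue is an assumption, since conventions for the denominator differ. The example sequences are made up.

```python
import re


def internal_gap_runs(aligned: str) -> list[int]:
    """Lengths of gap runs inside a gapped sequence, ignoring end gaps."""
    core = aligned.strip("-")
    return [len(m.group()) for m in re.finditer(r"-+", core)]


def alignment_score(a: str, b: str, gap_open: float = 3.0, gap_symbol: float = 0.10) -> float:
    """Score a fixed alignment with a unary matrix and the stated gap penalties."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    penalty = sum(gap_open + gap_symbol * n
                  for seq in (a, b) for n in internal_gap_runs(seq))
    return matches - penalty


def percent_identity(a: str, b: str) -> float:
    """Percent identity over columns in which both sequences have a residue."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


if __name__ == "__main__":
    seq_a = "ACGT-ACGT"  # made-up aligned sequences
    seq_b = "ACGTTACGA"
    print(percent_identity(seq_a, seq_b), alignment_score(seq_a, seq_b))
```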

[0054] Other nucleic acids homologous as the structural gene of the present invention typically include nucleic acids
which hybridize under stringent conditions to a nucleotide consisting of a nucleotide sequence within SEQ ID NO: 1
or 3, preferably a nucleotide sequence consisting of nucleotides 106 to 1503 of SEQ ID NO: 1, preferably a nucleotide
sequence consisting of nucleotides 103 to 1512 of SEQ ID NO: 3, or a nucleotide sequence complementary thereto
and which encode polypeptides having the ability to transfer N-acetyl-D-galactosamine to N-acetyl-D-glucosamine with
β1,3 linkage.

[0055] As used herein, "under stringent conditions" means that a nucleic acid hybridizes under conditions of moderate
or high stringency. More specifically, conditions of moderate stringency may readily be determined by those having
ordinary skill in the art, e.g., depending on the length of DNA. Primary conditions can be found in Sambrook et al.,
Molecular Cloning: A Laboratory Manual, 3rd edition, Vol. 1, 7.42-7.45, Cold Spring Harbor Laboratory Press, 2001 and
include the use of a prewashing solution for nitrocellulose filters of 5 x SSC, 0.5% SDS, 1.0 mM EDTA (pH 8.0),
hybridization conditions of about 50% formamide, 2 x SSC to 6 x SSC at about 40-50°C (or other similar hybridization
solutions, such as Stark's solution, in about 50% formamide at about 42°C) and washing conditions of about 60°C, 0.5
x SSC, 0.1% SDS. Conditions of high stringency can also be readily determined by those skilled in the art, e.g.,
depending on the length of DNA. In general, such conditions include hybridization and/or washing at a higher
temperature and/or at a lower salt concentration than that required under conditions of moderate stringency and, for example,
are defined as hybridization conditions as above and with washing at about 68°C, 0.2 x SSC, 0.1% SDS. Those skilled
in the art will recognize that the temperature and washing solution salt concentration can be adjusted as necessary
according to factors such as the length of nucleotide sequences.
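
The two washing conditions named above can be written down as data and compared. The sketch below is illustrative only: the class and function names are hypothetical, and the conversion relies on the usual composition of 1 x SSC (0.15 M NaCl with 0.015 M sodium citrate), which is a general laboratory convention rather than something stated in this text. The comparison function encodes the "higher temperature and/or lower salt" criterion as "no less stringent in either respect and strictly more stringent in at least one".

```python
from dataclasses import dataclass

NACL_MM_PER_1X_SSC = 150.0  # 1 x SSC is conventionally 0.15 M NaCl


@dataclass(frozen=True)
class WashCondition:
    label: str
    temp_c: float
    ssc_x: float        # SSC concentration as a multiple of 1 x SSC
    sds_percent: float

    @property
    def nacl_mm(self) -> float:
        """Approximate NaCl concentration of the wash in mM."""
        return self.ssc_x * NACL_MM_PER_1X_SSC


# Washing conditions as given in the paragraph above.
MODERATE = WashCondition("moderate stringency", 60.0, 0.5, 0.1)
HIGH = WashCondition("high stringency", 68.0, 0.2, 0.1)


def is_more_stringent(a: WashCondition, b: WashCondition) -> bool:
    """True if a is hotter and/or lower in salt than b, and worse in neither."""
    no_worse = a.temp_c >= b.temp_c and a.ssc_x <= b.ssc_x
    strictly_better = a.temp_c > b.temp_c or a.ssc_x < b.ssc_x
    return no_worse and strictly_better


if __name__ == "__main__":
    for cond in (MODERATE, HIGH):
        print(f"{cond.label}: {cond.temp_c} C, about {cond.nacl_mm:.0f} mM NaCl")
    print(is_more_stringent(HIGH, MODERATE))
```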

[0056] As described above, those skilled in the art will readily determine and achieve conditions of suitably moderate 
or high stringency on the basis of common knowledge about hybridization conditions which are known in the art, as 
well as on the empirical rule which will be obtained through commonly used experimental means. 
(2) Vector and transformant of the present invention

[0057] The present invention provides a recombinant vector carrying the above nucleic acid. Procedures for
integrating a DNA fragment of the nucleic acid into a vector (e.g., a plasmid) include those described in Sambrook, J. et
al., Molecular Cloning, A Laboratory Manual (3rd edition), Cold Spring Harbor Laboratory, 1.1 (2001). For convenience,
a commercially available ligation kit (e.g., a product of TaKaRa Shuzo Co., Ltd., Japan) may be used.

[0058] The recombinant vector (e.g., recombinant plasmid) thus obtained may be introduced into host cells (e.g., E.
coli DH5α, TB1, LE392, XL-LE392 or XL-1 Blue). Procedures for introducing the plasmid into host cells include those
described in Sambrook, J. et al., Molecular Cloning, A Laboratory Manual (3rd edition), Cold Spring Harbor Laboratory,
16.1 (2001), exemplified by the calcium chloride method or the calcium chloride/rubidium chloride method,
electroporation, electroinjection, chemical treatment (e.g., PEG treatment), and the gene gun method.

[0059] A vector which can be used may be prepared readily by linking a desired gene to a recombination vector
available in the art (e.g., plasmid DNA) in a routine manner. Specific examples of a vector to be used include, but are
not limited to, E. coli-derived plasmids such as pDONR201, pBluescript, pUC18, pUC19 and pBR322.

[0060] Those skilled in the art will be able to select appropriate restriction ends to fit into the intended expression
vector. The expression vector may be selected appropriately by those skilled in the art such that the vector is suitable
for host cells where the enzyme of the present invention is to be expressed. Moreover, the expression vector is
preferably constructed to allow regions involved in gene expression (e.g., promoter region, enhancer region and operator
region) to be properly located to ensure expression of the above nucleic acid in target host cells, so that the nucleic
acid is properly expressed.

[0061] The type of expression vector is not limited in any way as long as the vector allows expression of a desired
gene in various prokaryotic and/or eukaryotic host cells and has the function of producing a desired protein. Preferred
examples include pQE-30, pQE-60, pMAL-C2, pMAL-p2 and pSE420 for E. coli expression, pYES2 (Saccharomyces)
and pPIC3.5K, pPIC9K and pAO815 (all Pichia) for yeast expression, as well as pFastBac, pBacPAK8/9, pBK283,
pVL1392 and pBlueBac4.5 for insect expression.

[0062] To construct the expression vector, a Gateway system (Invitrogen Corporation) may be used, which does not
require restriction treatment and ligation operation. The Gateway system is a site-specific recombination system which
allows cloning while maintaining the orientation of PCR products and also allows subcloning of a DNA fragment into
a properly modified expression vector. More specifically, this system prepares an expression clone corresponding to
the intended expression system by creating an entry clone from a PCR product and a donor vector by the action of a
site-specific recombinase, BP clonase, and then transferring the PCR product to a destination vector which allows
recombination with this clone by the action of another recombinase, LR clonase. One feature of this system is that a
time- and labor-consuming subcloning step which requires treatment with restriction enzymes and/or ligases can be
eliminated when an entry clone is created to begin with.

[0063] The above expression vector carrying the nucleic acid of the present invention may be integrated into host 
cells to give a transformant for producing the polypeptide of the present invention. In general, host cells used for 
obtaining the transformant may be either eukaryotic cells (e.g., mammalian cells, yeast, insect cells) or prokaryotic
cells (e.g., E. coli, Bacillus subtilis). Also, cultured cells of human origin (e.g., HeLa, 293T, SH-SY5Y) or mouse origin
(e.g., Neuro2a, NIH3T3) may be used for this purpose. All of these host cells are known and commercially available 
(e.g., from Dainippon Pharmaceutical Co., Ltd., Japan), or available from public research institutions (e.g., RIKEN Cell 
Bank). Alternatively, it is also possible to use embryos, organs, tissues or non-human individuals. 
[0064] Since the nucleic acid of the present invention was found from human genome libraries, it is believed that
when eukaryotic cells are used as host cells, the G34 enzyme protein of the present invention may have properties
close to native proteins (e.g., embodiments where glycosylation occurs). In light of this point, it is preferable to select
eukaryotic cells, particularly mammalian cells, as host cells. Specific examples of mammalian cells include animal cells
of mouse, Xenopus laevis, rat, hamster, monkey or human origin or cultured cell lines established from these cells.
E. coli, yeast or insect cells available for use as host cells are specifically exemplified by E. coli (e.g., DH5α, M15, JM109,
BL21), yeast (e.g., INVSc1 (Saccharomyces), GS115, KM71 (both Pichia)) or insect cells (e.g., Sf21, BmN4, silkworm
larva).

[0065] In general, an expression vector can be prepared by linking at least a promoter, an initiation codon, a gene 
encoding a desired protein, a termination codon and a terminator region to an appropriate replicable unit to give a 
continuous loop. In this case, if desired, it is also possible to use an appropriate DNA fragment (e.g., linkers, other
restriction enzyme sites) through routine techniques such as digestion with a restriction enzyme and/or ligation using
T4 DNA ligase. When bacterial (particularly E. coli) cells are used as host cells, an expression vector is generally 
composed of at least a promoter/operator region, an initiation codon, a gene encoding a desired protein, a termination 
codon, a terminator and a replicable unit. When yeast cells, plant cells, animal cells or insect cells are used as host 
cells, it is generally preferred that an expression vector comprises at least a promoter, an initiation codon, a gene
encoding a desired protein, a termination codon and a terminator. In this case, the vector may also comprise DNA
encoding a signal peptide, an enhancer sequence, 5'- and 3'-terminal untranslated regions of the desired gene, a
selective marker region or a replicable unit, as appropriate.

[0066] A replicable unit refers to DNA having the ability to replicate its entire DNA sequence in host cells and includes 
a native plasmid, an artificially modified plasmid (i.e., a plasmid prepared from a native plasmid) and a synthetic plasmid.
Examples of a preferred plasmid include plasmid pQE30, pET or pCAL or an artificially modified product thereof (i.e.,
a DNA fragment obtained from pQE30, pET or pCAL by treatment with an appropriate restriction enzyme) for E. coli 
cells, plasmid pYES2 or pPIC9K for yeast cells, as well as plasmid pBacPAK8/9 for insect cells. 
[0067] A methionine codon (ATG) may be given as an example of an initiation codon preferred for the vector of the 
present invention. Examples of a termination codon include commonly used termination codons (e.g., TAG, TGA, TAA).
As for enhancer and terminator sequences, it is also possible to use those commonly used by those skilled in the art,
such as SV40-derived enhancer and terminator sequences.
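
As a rough illustration of the codon requirements described in paragraphs [0065] to [0067], the following Python sketch checks that a coding insert begins with ATG, ends with one of the listed termination codons and contains no in-frame termination codon in between. The function name and the example sequence are hypothetical and are not taken from SEQ ID NO: 1 or 3.

```python
# Termination codons listed in paragraph [0067].
STOP_CODONS = {"TAG", "TGA", "TAA"}


def is_complete_orf(seq: str) -> bool:
    """Return True if seq starts with ATG, ends with a termination codon,
    has a length divisible by 3 and has no internal in-frame stop."""
    seq = seq.upper()
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    # Split the sequence into consecutive triplets in reading frame 1.
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    if codons[0] != "ATG" or codons[-1] not in STOP_CODONS:
        return False
    # Any stop codon before the last triplet would truncate the protein.
    return not any(codon in STOP_CODONS for codon in codons[1:-1])


# Hypothetical toy insert: ATG, one alanine codon, TAA.
print(is_complete_orf("ATGGCTTAA"))
```

Such a check only covers the coding region; promoter, terminator and replicable unit still have to be chosen for the intended host as described above.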

[0068] As a selective marker, a commonly used one can be used in a routine manner. Examples include antibiotic
resistance genes such as those conferring resistance to tetracycline, ampicillin, kanamycin, neomycin, hygromycin or
spectinomycin.

[0069] The introduction (also referred to as transformation or transfection) of the expression vector according to the
present invention into host cells may be accomplished by using conventionally known techniques. Transformation may
be accomplished, for example, by the method of Cohen et al. [Proc. Natl. Acad. Sci. USA, 69, 2110 (1972)], the
protoplast method [Mol. Gen. Genet., 168, 111 (1979)] or the competent method [J. Mol. Biol., 56, 209 (1971)] for bacterial
cells (e.g., E. coli, Bacillus subtilis) and by the method of Hinnen et al. [Proc. Natl. Acad. Sci. USA, 75, 1927 (1978)]
or the lithium method [J. Bacteriol., 153, 163 (1983)] for Saccharomyces cerevisiae. Transformation may also be
accomplished, for example, by the leaf disk method [Science, 227, 129 (1985)] or electroporation [Nature, 319, 791
(1986)] for plant cells, by the method of Graham et al. [Virology, 52, 456 (1973)] for animal cells, and by the method
of Summers et al. [Mol. Cell Biol., 3, 2156-2165 (1983)] for insect cells.

(3) G34 enzyme protein of the present invention

[0070] As illustrated in the Example section described later, a polypeptide having a novel enzymatic activity can be 
isolated and purified, for example, by integrating a nucleic acid having the nucleotide sequence of SEQ ID NO: 1 or 3 
into an expression vector and then expressing the nucleic acid.

[0071] First, in light of the above point, a typical embodiment of the protein of the present invention is an isolated 
G34 enzyme protein consisting of the putative amino acid sequence shown in SEQ ID NO: 2 or 4. More specifically, 
this enzyme protein has the activities shown below. 

Catalytic reaction 

[0072] The enzyme protein allows transfer of "N-acetyl-D-galactosamine (GalNAc)" from its donor substrate to an
acceptor substrate containing "N-acetyl-D-glucosamine (GlcNAc)." Examination of motif sequences in the amino acid
sequence indicates that the linking mode between N-acetylgalactosamine and N-acetylglucosamine is a β1,3 glycosidic
linkage (see Example 2).

Donor substrate specificity: 

[0073] The above N-acetyl-D-galactosamine donor substrate encompasses sugar nucleotides having N-acetylgalactosamine,
such as uridine diphosphate-N-acetylgalactosamine (UDP-GalNAc), adenosine diphosphate-N-acetylgalactosamine
(ADP-GalNAc), guanosine diphosphate-N-acetylgalactosamine (GDP-GalNAc) and cytidine diphosphate-N-acetylgalactosamine
(CDP-GalNAc). A typical donor substrate is UDP-GalNAc.

[0074] Namely, the G34 enzyme protein of the present invention catalyzes a reaction of the following scheme: 



UDP-GalNAc + GlcNAc-R → UDP + GalNAc-β1,3-GlcNAc-R

(wherein R represents, e.g., a glycoprotein, glycolipid, oligosaccharide or polysaccharide having the GlcNAc residue).

Acceptor substrate specificity:

[0075] An acceptor substrate of the above GalNAc is N-acetyl-D-glucosamine, typically an N-acetyl-D-glucosamine 
residue of glycoproteins, glycolipids, oligosaccharides or polysaccharides, etc. 
[0076] When using an oligosaccharide as an acceptor substrate, the human G34 protein obtained in Example 1
described later (typically having a region covering amino acid 36 to the C-terminal end of SEQ ID NO: 2) shows
transferase activity toward Bz-β-GlcNAc, GlcNAc-β1-4-GlcNAc-β-Bz, pNp-core2 (core2 = Gal-β1-3-(GlcNAc-β1-6)GalNAc-α-pNp;
the same applying hereinafter), pNp-core3 (core3 = GlcNAc-β1-3-GalNAc-α-pNp; the same applying hereinafter)
and pNp-core6 (core6 = GlcNAc-β1-6-GalNAc-α-pNp; the same applying hereinafter). Preferably, the human G34
protein is free from transferase activity toward Bz-α-GlcNAc and Gal-β1-3-GlcNAc-β-pNp. Moreover, when the activity
is compared between these substrates, the transferase activity is very high in transferring to pNp-core2 and
Bz-β-GlcNAc, and highest in transferring to pNp-core2. The transferase activity is relatively low in transferring to
GlcNAc-β1-4-GlcNAc-β-Bz, pNp-core3 and pNp-core6.

[0077] Likewise, the mouse G34 protein obtained in Example 4 described later (typically having an active region
covering amino acid 35 to the C-terminal end of SEQ ID NO: 4) shows transferase activity toward Bz-β-GlcNAc,
pNp-β-Glc, GlcNAc-β1-4-GlcNAc-β-Bz, pNp-core2, pNp-core3 and pNp-core6. When the activity is compared between
these substrates, the transferase activity is highest in transferring to Bz-β-GlcNAc, followed by core2-pNp, core6-pNp,
core3-pNp, pNp-β-Glc and GlcNAc-β1-4-GlcNAc-β-Bz in the order named.

[0078] As used herein, "GlcNAc" represents an N-acetyl-D-glucosamine residue, "GalNAc" represents an
N-acetyl-D-galactosamine residue, "Glc" represents a glucose residue, "Bz" represents a benzyl group, "pNp" represents
a p-nitrophenyl group, "oNp" represents an o-nitrophenyl group, and "-" represents a glycosidic linkage. Numbers in
these formulae each represent the carbon number in the sugar ring where the above glycosidic linkage is present.
Likewise, "α" and "β" represent anomers of the above glycosidic linkage at the 1-position of the sugar ring. An anomer
whose positional relationship with CH2OH or CH3 at the 5-position is trans and cis is represented by "α" and "β",
respectively.

Optimum buffer and optimum pH (Table 3 and Figure 4): 






[0079] Examination of the human G34 protein indicates that the protein has the above catalytic effect in each of the 
following optimum buffers: MES (2-morpholinoethanesulfonic acid) buffer, sodium cacodylate buffer or HEPES
(N-[2-hydroxyethyl]piperazine-N'-[2-ethanesulfonic acid]) buffer.

[0080] The pH dependence of the activity in each buffer is as follows: in MES buffer, the activity is highest around a
pH of at least 5.50 to 5.78 and second highest around pH 6.75; in sodium cacodylate buffer, the activity increases with
decrease in pH from around 6.2 to around 5.0 and is highest around pH 5.0, while the activity also increases in a
pH-dependent manner between around pH 6.2 and 7.0 and nearly plateaus around pH 7.4; and in HEPES buffer, the
activity is highest around a pH of 7.4 to 7.5. Among them, HEPES buffer at a pH of about 7.4 to about 7.5 results in
the strongest activity. In all the buffers, the activity is lower in a pH range of 6.2 to 6.6 than in other pH ranges.


Divalent ion requirement (Table 4 and Figure 5): 

[0081] The activity of the human G34 protein is enhanced in the presence of a divalent metal ion, particularly Mn2+,
Co2+ or Mg2+. The influence of each metal ion concentration on the activity is as follows: in the case of Mn2+ and Co2+,
the activity increases in a concentration-dependent manner up to around 5.0 nM and then nearly plateaus at higher
concentrations, while in the case of Mg2+, the activity increases in a concentration-dependent manner up to around
2.5 nM and then nearly plateaus at higher concentrations. However, the Mn2+-induced enhancement of the activity is
completely eliminated in the presence of Cu2+.

[0082] As described above, the G34 enzyme protein of the present invention can transfer a GalNAc residue to a 
GlcNAc residue with β1-3 glycosidic linkage under given enzymatic reaction conditions as mentioned above and is
useful for such sugar chain synthesis or modification reactions targeted at glycoproteins, glycolipids, oligosaccharides 
or polysaccharides, etc. 

[0083] Secondly, having disclosed herein the amino acid sequences shown in SEQ ID NOs: 2 and 4 which are given 
as typical examples of the primary structure of the above enzyme protein, the present invention provides all proteins
which can be produced on the basis of these amino acid sequences through genetic engineering procedures well
known in the art (hereinafter also referred to as "mutated proteins" or "modified proteins"). Namely, according to
common knowledge in the art, the enzyme protein of the present invention is not limited only to a protein consisting of the
amino acid sequence of SEQ ID NO: 2 or 4 estimated from the nucleotide sequence of each cloned nucleic acid, and
is also intended to include, for example, a protein consisting of a non-full-length polypeptide having, e.g., a partial
N-terminal deletion of the amino acid sequence, or a protein homologous to such an amino acid sequence, each of which
has properties inherent to the protein, as illustrated below.

[0084] First, the human G34 enzyme protein of the present invention may preferably have an amino acid sequence 
covering amino acid 189 to the C-terminal end of SEQ ID NO: 2, more preferably an amino acid sequence covering 
amino acid 36 to the C-terminal end as obtained in the Example section described later. Likewise, the mouse G34 
enzyme protein of the present invention may preferably have an amino acid sequence covering amino acid 35 to the
C-terminal end of SEQ ID NO: 4. 
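
The regions named in paragraph [0084] use 1-based residue numbering, so "amino acid 36 to the C-terminal end" drops the first 35 residues. A minimal Python sketch of that slicing follows; the function name and the placeholder sequence are hypothetical, and the actual sequences of SEQ ID NO: 2 and 4 are not reproduced here.

```python
def c_terminal_region(protein: str, first_residue: int) -> str:
    """Return the residues from first_residue (1-based) to the C-terminus."""
    if not 1 <= first_residue <= len(protein):
        raise ValueError("first_residue is outside the sequence")
    # Python slices are 0-based, so residue n sits at index n - 1.
    return protein[first_residue - 1:]


# Placeholder 40-residue sequence standing in for a real protein.
placeholder = "M" + "A" * 39
truncated = c_terminal_region(placeholder, 36)
print(len(truncated))  # residues 36 to 40 remain
```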

[0085] Moreover, it is well known that proteins having physiological activities, such as enzymes, usually maintain
those activities even when their amino acid sequences have substitution, deletion, insertion or
addition of one or more amino acids. It is also known that among naturally-occurring proteins, there are mutated proteins
which have gene mutations resulting from differences in the species of source organisms and/or differences in ecotype
or which have one or more amino acid mutations resulting from the presence of closely resembling isozymes, etc. In
light of this point, the protein of the present invention also encompasses mutated proteins which have an amino acid
sequence with substitution, deletion, insertion or addition of one or more amino acids in each amino acid sequence
shown in SEQ ID NO: 2 or 4 and which have the ability to transfer a GalNAc residue to a GlcNAc residue with β1-3
glycosidic linkage under given enzymatic reaction conditions as mentioned above. Moreover, particularly preferred are
modified proteins having amino acid sequences with substitution, deletion, insertion or addition of one or several amino
acids in each amino acid sequence shown in SEQ ID NO: 2 or 4.

[0086] The expression "one or more amino acids" found above means preferably 1 to 200 amino acids, more
preferably 1 to 100 amino acids, even more preferably 1 to 50 amino acids, and most preferably 1 to 20 amino acids. In
general, in a case where amino acid substitution occurs as a result of site-specific mutagenesis, the number of amino
acids which can be substituted while maintaining the activities inherent to the original protein is preferably 1 to 10.
[0087] The modified protein of the present invention also includes those obtained by substitution between functionally 
equivalent amino acids. Namely, it is generally well known to those skilled in the art that recombinant proteins having 
a desired mutation(s) can be prepared by procedures involving introduction of substitution between functionally
equivalent amino acids (e.g., replacement of one hydrophobic amino acid with another hydrophobic amino acid, replacement
of one hydrophilic amino acid with another hydrophilic amino acid, replacement of one acidic amino acid with another 
acidic amino acid, or replacement of one basic amino acid with another basic amino acid). The modified proteins thus 
obtained often have the same properties as the original protein. In light of this point, modified proteins having such 
amino acid substitutions also fall within the scope of the present invention. 
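
The notion of substitution between functionally equivalent amino acids in paragraph [0087] can be sketched as a class lookup. The text names the four classes but does not list their members, so the grouping below is an assumption made for illustration, and the function names are hypothetical.

```python
# Assumed class membership (one-letter codes); the source text names the
# classes but does not specify which residues belong to each.
CLASSES = {
    "hydrophobic": set("AVLIMFWP"),
    "hydrophilic": set("STNQYCG"),
    "acidic": set("DE"),
    "basic": set("KRH"),
}


def residue_class(aa: str) -> str:
    """Return the name of the class that contains the residue."""
    for name, members in CLASSES.items():
        if aa.upper() in members:
            return name
    raise ValueError(f"unknown residue: {aa!r}")


def is_functionally_equivalent(original: str, replacement: str) -> bool:
    """True if both residues fall in the same assumed class."""
    return residue_class(original) == residue_class(replacement)


print(is_functionally_equivalent("L", "I"))  # both hydrophobic in this grouping
print(is_functionally_equivalent("K", "D"))  # basic versus acidic
```

Whether a given substitution actually preserves the transferase activity still has to be confirmed by assay; the lookup only flags candidates.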

[0088] Moreover, the modified protein of the present invention may be a glycoprotein having sugar chains attached
to the polypeptide as long as it has such an amino acid sequence as defined above and has an enzymatic activity 
inherent to the intended enzyme. 

[0089] To identify the range of the homologous protein of the present invention, an identity search using GENETYX
software (Genetyx Corporation, Japan) was performed for the amino acid sequence shown in SEQ ID NO: 2 or 4 of the
present invention, indicating that the amino acid sequence shares 14% identity with a known β1,4GalNAc transferase
showing the highest homology (Non-patent Document 1 listed above) and also shares 30% identity with a known
β1,3Gal transferase showing the highest homology (Non-patent Document 2 listed above). In light of these points, a
preferred amino acid sequence for the homologous protein of the present invention shares more than 30%
identity, more preferably at least 40% identity, and particularly preferably at least 50% identity with the amino acid
sequence shown in SEQ ID NO: 2 or 4.

[0090] Likewise, the amino acid sequences shown in SEQ ID NOs: 2 and 4 share 88% identity with each other. In 
light of this point, a preferred amino acid sequence for the homologous protein of the present invention can be defined 
as sharing at least 88%, more preferably at least 90% identity with the amino acid sequence within SEQ ID NO: 2.

[0091] The above GENETYX is genetic information processing software for nucleic acid/protein analysis and enables 
standard analyses of homology and multialignment, as well as signal peptide prediction, promoter site prediction and 
secondary structure prediction. The homology analysis program used herein employs the Lipman-Pearson method 
(Lipman, D.J. & Pearson, W.R., Science, 227, 1435-1441 (1985)) frequently used as a rapid and sensitive method.
In the present invention, the percentage of identity may be determined by comparing sequence information using,
e.g., the BLAST program described by Altschul et al. (Nucl. Acids Res., 25, 3389-3402 (1997)) or the FASTA program
described by Pearson et al. (Proc. Natl. Acad. Sci. USA, 85, 2444-2448 (1988)). These programs are available on the
Internet at the web site of the National Center for Biotechnology Information (NCBI) or the DNA Data Bank of Japan
(DDBJ). The details of various conditions (parameters) for each identity search using each program are shown on
these web sites, and default values are commonly used for these searches although part of the settings may be changed
as appropriate. It is also possible to use other sequence comparison programs used by those skilled in the art.
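
For orientation, a percentage of identity can be computed from two sequences that have already been aligned. The Python sketch below is not the Lipman-Pearson, BLAST or FASTA algorithm; it only counts identical columns in a given alignment. The choice of denominator (all alignment columns, including gapped ones) is an assumption, since programs differ on this point, and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns holding the same residue.

    Both inputs must be the same length, with '-' marking gaps.
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    # A column counts as identical only if both residues match and are not gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Hypothetical six-column alignment with one gap and one mismatch.
print(round(percent_identity("MKT-LV", "MKTALI"), 1))
```

A threshold such as "at least 40% identity" can then be tested against the returned value, provided the same alignment method and denominator are used throughout.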
[0092] Thirdly, the isolated protein of the present invention may be administered as an immunogen to an animal to 
produce an antibody against the protein, as described later. Such an antibody may be used for immunoassays to 
measure and quantify the enzyme. Thus, the present invention is also useful in preparing such an immunogen. 
In light of this point, the protein of the present invention also includes a polypeptide fragment, mutant or fusion protein
thereof, which contains an antigenic determinant or epitope for eliciting antibody formation. 

(4) Isolation and purification of the G34 enzyme protein of the present invention 

[0093] The enzyme protein of the present invention may be isolated and purified in the following manner.

[0094] Recent studies have established genetic engineering procedures which involve culturing and growing a trans- 
formant and isolating and purifying a substance of interest from the resulting culture or grown transformant. The enzyme 
protein of the present invention may also be expressed (produced), e.g., by culturing in a nutrient medium a transformant
containing an expression vector carrying the nucleic acid of the present invention. 

[0095] A nutrient medium used for transformant culturing preferably contains a carbon source, an inorganic nitrogen
source or an organic nitrogen source required for host cell (transformant) growth. Examples of a carbon source include 
glucose, dextran, soluble starch, sucrose and methanol. Examples of an inorganic or organic nitrogen source include 
ammonium salts, nitrate salts, amino acids, corn steep liquor, peptone, casein, meat extracts, soybean meal and potato 
extracts. If desired, the medium may contain other nutrients such as inorganic salts (e.g., sodium chloride, calcium
chloride, sodium dihydrogen phosphate, magnesium chloride), vitamins, and antibiotics (e.g., tetracycline, neomycin,
ampicillin, kanamycin). Culturing may be accomplished in a manner known in the art. Culture conditions such as tem- 
perature, medium pH and culture period may be appropriately selected such that the protein according to the present 
invention is produced in a large quantity. 

[0096] The enzyme protein of the present invention may be obtained from the above culture or grown transformant 
as follows. Namely, in a case where a protein of interest is accumulated in host cells, the host cells may be collected
by manipulations such as centrifugation or filtration, suspended in an appropriate buffer (e.g., Tris buffer, phosphate 
buffer, HEPES buffer or MES buffer at a concentration around 10 to 100 mM, the pH of which will vary from buffer to 
buffer, but desirably falls within the range of 5.0 to 9.0), and then crushed in a manner suitable for the host cells used, 
followed by centrifugation to obtain the contents of the host cells. On the other hand, in a case where a protein of 
interest is secreted from host cells, the host cells and the medium are separated from each other by manipulations
such as centrifugation or filtration to obtain a culture filtrate. The crushed host cell solution or culture filtrate may be 
provided directly or may be treated by ammonium sulfate precipitation and dialysis before being provided for isolation 
and purification of the protein. 

[0097] Isolation and purification of a protein of interest may be accomplished in the following manner. Namely, in a 
case where the protein is labeled with a tag such as 6 x histidine, GST or maltose-binding protein, the isolation and
purification may be accomplished by affinity chromatography suitable for each of the commonly used tags. On the 
other hand, in a case where the protein according to the present invention is produced without being labeled with such 
a tag, the isolation and purification may be accomplished, e.g., by ion exchange chromatography, which may further 
be combined with gel filtration, hydrophobic chromatography, isoelectric chromatography, etc.

[0098] Moreover, an expression vector may be constructed to facilitate isolation and purification. In particular, the
isolation and purification is facilitated if an expression vector is constructed to express a fusion protein of a polypeptide
having an enzymatic activity with an identification peptide and the enzyme protein is prepared in a genetic engineering
manner. An example of the above identification peptide is a peptide having the function of facilitating secretion,
separation, purification or detection of the enzyme according to the present invention from the grown transformant by
allowing the enzyme to be expressed as a fusion protein in which the identification peptide is attached to a polypeptide
having an enzymatic activity when the enzyme according to the present invention is prepared by gene recombination
techniques.

[0099] Examples of such an identification peptide include peptides such as a signal peptide (a peptide composed of
15 to 30 amino acid residues, which is present at the N-terminal end of many proteins and is functional in cells for
protein selection in the intracellular membrane permeation mechanism; e.g., OmpA, OmpT, Dsb), protein kinase A,
Protein A (a protein with a molecular weight of about 42,000, which is a component constituting the Staphylococcus
aureus cell wall), glutathione S transferase, His tag (a sequence consisting of 6 to 10 histidine residues in series), myc
tag (a 13 amino acid sequence derived from cMyc protein), FLAG peptide (an analysis marker composed of 8 amino
acid residues), T7 tag (composed of the first 11 amino acid residues of the gene 10 protein), S tag (composed of
pancreas RNase A-derived 15 amino acid residues), HSV tag, pelB (a 22 amino acid sequence from the E. coli external
membrane protein pelB), HA tag (composed of hemagglutinin-derived 10 amino acid residues), Trx tag (thioredoxin
sequence), CBP tag (calmodulin-binding peptide), CBD tag (cellulose-binding domain), CBR tag (collagen-binding
domain), β-lac/blu (β-lactamase), β-gal (β-galactosidase), luc (luciferase), HP-Thio (His-patch thioredoxin), HSP (heat
shock peptide), Lnγ (laminin γ-peptide), Fn (fibronectin partial peptide), GFP (green fluorescent peptide), YFP (yellow
fluorescent peptide), CFP (cyan fluorescent peptide), BFP (blue fluorescent peptide), DsRed, DsRed2 (red fluorescent
peptides), MBP (maltose-binding peptide), LacZ (lactose operator), IgG (immunoglobulin G), avidin and Protein G, any
of which can be used.

[0100] Among them, particularly preferred are the signal peptide, protein kinase A, Protein A, glutathione S
transferase, His tag, myc tag, FLAG peptide, T7 tag, S tag, HSV tag, pelB and HA tag because they facilitate expression
and purification of the enzyme according to the present invention through genetic engineering procedures. In particular,
it is preferable to obtain the enzyme as a fusion protein with FLAG peptide (Asp-Tyr-Lys-Asp-Asp-Asp-Asp-Lys)
because it is very easy to handle. The above FLAG peptide is extremely antigenic and provides an epitope capable of
reversible binding of a specific monoclonal antibody, thus enabling rapid assay and easy purification of the expressed
recombinant protein. A mouse hybridoma called 4E11 produces a monoclonal antibody which binds to FLAG peptide
in the presence of a certain divalent metal cation, as described in United States Patent No. 5,011,912 (incorporated
herein by reference). A 4E11 hybridoma cell line has been deposited under Accession No. HB 9259 with the American
Type Culture Collection. The monoclonal antibody binding to FLAG peptide is available from Eastman Kodak Co.,
Scientific Imaging Systems Division, New Haven, Connecticut.
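
To make the FLAG fusion concrete, the Python sketch below converts the three-letter FLAG sequence given in paragraph [0100] to one-letter code and attaches it to a protein sequence. Placing the tag at the N-terminal end, the function names and the placeholder protein are assumptions made for illustration; the actual construct is described in the Example section.

```python
# Three-letter to one-letter codes for the residues that occur in FLAG.
THREE_TO_ONE = {"Asp": "D", "Tyr": "Y", "Lys": "K"}

# FLAG peptide as written in paragraph [0100].
FLAG_THREE_LETTER = "Asp-Tyr-Lys-Asp-Asp-Asp-Asp-Lys"


def to_one_letter(three_letter: str) -> str:
    """Convert a hyphen-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[residue] for residue in three_letter.split("-"))


def n_terminal_fusion(tag: str, protein: str) -> str:
    """Return the tag followed directly by the protein sequence."""
    return tag + protein


FLAG = to_one_letter(FLAG_THREE_LETTER)
print(FLAG)                            # the 8-residue tag in one-letter code
print(n_terminal_fusion(FLAG, "MKT"))  # hypothetical 3-residue placeholder
```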

[0101] pFLAG-CMV-1 (SIGMA) can be presented as an example of a basic vector which can be expressed in
mammalian cells and enables obtaining the enzyme protein of the present invention as a fusion protein with the above FLAG
peptide. Likewise, examples of a vector which can be expressed in insect cells include, but are not limited to, pFBIF
(i.e., a vector prepared by integrating the region encoding FLAG peptide into pFastBac (Invitrogen Corporation); see
the Example section described later). Those skilled in the art will be able to select an appropriate basic vector depending
on, e.g., the host cell, restriction enzyme and identification peptide to be used for expression of the enzyme.

(5) Antibody recognizing the G34 enzyme protein of the present invention 

[0102] The present invention provides an antibody which is immunoreactive to the G34 enzyme protein. Such an
antibody is capable of specifically binding to the enzyme protein via the antigen-binding site of the antibody (as opposed 
to non-specific binding). More specifically, a protein having the amino acid sequence of SEQ ID NO: 2 or 4 or a fragment, 
mutant or fusion protein thereof may be used as an immunogen for producing an antibody immunoreactive to each of 
them. 

[0103] More specifically, such a protein, fragment, mutant or fusion protein contains an antigenic determinant or
epitope for eliciting antibody formation. Such an antigenic determinant or epitope may be either linear or conformational
(discontinuous). The antigenic determinant or epitope can be identified by any technique known in the art. Thus, the 
present invention also relates to an antigenic epitope of the G34 enzyme protein. Such an epitope is useful in preparing 
an antibody, particularly a monoclonal antibody, as described in more detail below. 

[0104] The epitope of the present invention can be used in assays and as a research reagent for purifying a specific
binding antibody from materials such as polyclonal sera or supernatants from cultured hybridomas. Such an epitope
or a variant thereof may be prepared using techniques known in the art (e.g., solid phase synthesis, chemical or
enzymatic cleavage of a protein) or using recombinant DNA technology.
[0105] The enzyme protein of the present invention may be used to derive any embodiment of an antibody. If the
entire or partial polypeptide of the protein, or an epitope thereof, has been isolated, both polyclonal and monoclonal antibodies
can be prepared using conventional techniques. See, e.g., Kennet et al. (eds.), Monoclonal Antibodies, Hybridomas: 
A New Dimension in Biological Analyses, Plenum Press, New York, 1980. 

[0106] The present invention also provides a hybridoma cell line producing a monoclonal antibody specific to the
G34 enzyme protein. Such a hybridoma can be produced and identified by conventional techniques. One method for
producing such a hybridoma cell line involves immunizing an animal with the enzyme protein of the present invention,
collecting spleen cells from the immunized animal, fusing the spleen cells with a myeloma cell line to give hybridoma
cells, and identifying a hybridoma cell line which produces a monoclonal antibody binding to the enzyme. The resulting
monoclonal antibody may be collected by conventional techniques.

[0107] The monoclonal antibody of the present invention encompasses chimeric antibodies, for example, humanized 
mouse monoclonal antibodies. Such a humanized antibody is advantageous in reducing immunogenicity when admin- 
istered to a human subject. 

[0108] The present invention also provides an antigen-binding fragment of the above antibody. Examples of an
antigen-binding fragment which can be produced by conventional techniques include, but are not limited to, Fab and
F(ab')2 fragments. The present invention also provides an antibody fragment and derivative which can be produced by
genetic engineering techniques.

[0109] The antibody of the present invention can be used in assays to detect the presence of the G34 enzyme protein 
of the present invention or a polypeptide fragment thereof, either in vitro or in vivo. The antibody of the present invention 
may also be used in purifying the G34 enzyme protein or a polypeptide fragment thereof by immunoaffinity
chromatography.

[0110] Moreover, the antibody of the present invention may also be provided as a blocking antibody capable of 
blocking the binding of the above glycosyltransferase protein to its binding partner (e.g., acceptor substrate), thus 
inhibiting the enzyme's biological activity resulting from such binding. Such a blocking antibody may be identified using 
any suitable assay procedure, for example, by testing the antibody for the ability to inhibit the binding of the protein to
certain cells expressing an acceptor substrate. 

[0111] Alternatively, the blocking antibody may also be identified in assays for the ability to inhibit a biological effect 
resulting from the enzyme protein bound to its binding partner in target cells. Such an antibody may be used in an in 
vitro procedure or administered in vivo to inhibit a biological activity mediated by the entity that generated the antibody.
Thus, the present invention also provides an antibody for treating disorders which are caused or exacerbated by either
direct or indirect interaction between the G34 enzyme protein and its binding partner. Such therapy will involve in vivo 
administration of the blocking antibody to a mammal in an amount effective for inhibiting a binding partner-mediated 
biological activity. For use in such therapy, monoclonal antibodies are preferred and, in one embodiment, an antigen- 
binding antibody fragment is used. 






(6) Nucleic acid of the present invention for canceration assay 



[0112] In response to the discovery of the above G34 enzyme protein, the inventors of the present invention have
confirmed that mRNA encoding this protein is widely found in cancerous tissues and cell lines and that the expression
level of the mRNA is significantly increased particularly in cancerous tissues. Thus, the G34 nucleic acid is useful as
a tumor marker for, e.g., cancer diagnosis targeted at biological samples containing transcription products.
In this aspect, the present invention provides a nucleic acid for measurement, which is capable of hybridizing under
stringent conditions to a nucleic acid defined by the nucleotide sequence shown in SEQ ID NO: 1 or 3.
[0113] In one embodiment, the nucleic acid for measurement of the present invention is a primer or probe targeting
the G34 nucleic acid in a biological sample and having a nucleotide sequence selected from the nucleotide sequence
of SEQ ID NO: 1 or 3. In particular, since the nucleotide sequence of SEQ ID NO: 1 is derived from mRNA encoding
a structural gene and contains the entire open reading frame (ORF) of the G34 gene, full-length or nearly full-length
sequences of SEQ ID NO: 1 or 3 are usually found in transcription products from a biological sample. In light of this
point, the primer or probe according to the present invention has a desired partial sequence selected from each
nucleotide sequence of SEQ ID NO: 1 or 3 (either homologous or complementary to the selected sequence depending
on the intended use) and hence can be provided as a nucleic acid capable of specifically hybridizing to the target
sequence.

[0114] Typical examples of such a primer or probe include a native DNA fragment derived from a nucleic acid having
at least a part of the nucleotide sequence shown in SEQ ID NO: 1 or 3, a DNA fragment synthesized to have at least
a part of the nucleotide sequence shown in SEQ ID NO: 1 or 3, or complementary strands of these fragments.

[0115] Such a primer or probe as mentioned above may be used to detect and/or quantify the target nucleic acid in
a biological sample, as described later. Since sequences on the genome can also be targeted, the nucleic acid of the
present invention may also be used as an antisense primer for medical research or gene therapy.

(A) Probe of the present invention

[0116] In a preferred embodiment, the nucleic acid for measurement of the present invention is a probe targeting a 
nucleic acid having the nucleotide sequence of SEQ ID NO: 1 or 3 or a complementary strand of at least one of them. 
The probe contains an oligonucleotide composed of at least a dozen nucleotides, preferably at least 15 nucleotides,
more preferably at least 17 nucleotides, and still more preferably at least 20 nucleotides selected from the nucleotide sequences
of SEQ ID NOs: 1 and 3, or a complementary strand of the oligonucleotide, or full-length cDNA of its ORF region or a 
complementary strand of the cDNA. 

[0117] In a case where the nucleic acid for measurement of the present invention is provided as an oligonucleotide 
probe, it is understood that a length of a dozen nucleotides (e.g., 15 nucleotides, preferably 17 nucleotides) may be 
sufficient for the nucleic acid to specifically hybridize under stringent conditions to its target nucleic acid. Namely, those 
skilled in the art will be able to select an appropriate partial sequence composed of at least 15 to 20 nucleotides from 
the nucleotide sequence of SEQ ID NO: 1 or 3 in accordance with various known strategies for oligonucleotide probe
design. In this case, the amino acid sequence information shown in SEQ ID NO: 2 or 4 is helpful in selecting a unique 
sequence that may be suitable as a probe. 
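
By way of illustration only, the selection of partial sequences of a given length, together with their complementary strands, might be sketched as follows. The function names and the toy sequence are hypothetical and are not taken from SEQ ID NO: 1 or 3; the sketch merely enumerates candidates and does not assess uniqueness or hybridization behavior.

```python
# Illustrative sketch: enumerate candidate probe windows from a target sequence.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def candidate_probes(target: str, length: int = 20):
    """Yield (start, sense, antisense) for every window of `length` nucleotides.

    `start` is 1-based, matching the way nucleotide positions are quoted in the text.
    """
    target = target.upper()
    for i in range(len(target) - length + 1):
        window = target[i:i + length]
        yield i + 1, window, reverse_complement(window)


# Placeholder sequence for demonstration only (NOT from SEQ ID NO: 1 or 3).
toy_target = "ATGGCTCGAGGATCCAAGCTTGGTACCGAGCTC"
probes = list(candidate_probes(toy_target, length=20))
```

In practice, each candidate would still need to be checked for uniqueness against other transcripts, which this sketch does not attempt.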

[0118] Likewise, in the case of a cDNA probe, for example, a probe with a high molecular weight is generally difficult
to handle when used as a reagent or diagnostic agent for medical research. In light of this point, the probe of the 
present invention intended for medical research includes a nucleic acid composed of 50 to 500 nucleotides, more 
preferably 60 to 300 nucleotides selected from each nucleotide sequence of SEQ ID NO: 1 or 3. 
[0119] The term "stringent conditions" found above means conditions of moderate or high stringency as explained 
earlier. Those skilled in the art will be able to readily determine and achieve conditions of moderate or high stringency 
suitable for the selected probe, on the basis of common knowledge and empirical rules regarding known procedures for
various probe designs and hybridization conditions. 

[0120] Although depending on, e.g., the nucleotide length to be selected and the hybridization conditions to be ap- 
plied, a relatively short oligonucleotide probe can serve as a probe even when it has a mismatch of one or several 
nucleotides, particularly one or two nucleotides, in comparison with the nucleotide sequence of SEQ ID NO: 1 or 3. 
Likewise, a relatively long cDNA probe can also serve as a probe even when it has a mismatch of 50% or less, preferably 
20% or less, in comparison with the nucleotide sequence of SEQ ID NO: 1 or a nucleotide sequence complementary 
thereto. 
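
As a rough sketch of how the mismatch tolerances mentioned above could be checked, the following compares a probe with the corresponding target region position by position. It assumes an ungapped, equal-length comparison on the same strand; the function names and example sequences are hypothetical.

```python
def mismatch_stats(probe: str, target_region: str):
    """Return (mismatch count, mismatch fraction) for two equal-length sequences."""
    if len(probe) != len(target_region):
        raise ValueError("sequences must have the same length for a position-wise comparison")
    mismatches = sum(a != b for a, b in zip(probe.upper(), target_region.upper()))
    return mismatches, mismatches / len(probe)


def within_tolerance(probe, target_region, max_count=None, max_fraction=None):
    """Check a probe against an absolute and/or a fractional mismatch limit."""
    count, fraction = mismatch_stats(probe, target_region)
    if max_count is not None and count > max_count:
        return False
    if max_fraction is not None and fraction > max_fraction:
        return False
    return True


# Short oligonucleotide: tolerate one or two mismatches (toy sequences).
short_ok = within_tolerance("ACGTACGTAC", "ACGTACGTAA", max_count=2)
# Long cDNA probe: tolerate, e.g., 20% mismatches; here the toy pair differs by 10%.
long_ok = within_tolerance("ACGTACGTAC", "ACGTACGTAA", max_fraction=0.20)
```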

[0121] The probe of the present invention thus designed can be used as a labeled probe having a label such as a
fluorescent label, a radioactive label or a biotin label, in order to detect or confirm a hybrid formed with a target sequence
in G34.

[0122] For example, the labeled probe of the present invention may be used for confirmation or quantification of PCR 
amplification products from the G34 nucleic acid. In this case, it is preferable to use a probe targeting the nucleotide 
sequence located in a region between a pair of primer sequences used for PCR. An example of such a probe may be 
an oligonucleotide consisting of the nucleotide sequence shown in SEQ ID NO: 16 (corresponding to a complementary 
strand against nucleotides 525 to 556 in SEQ ID NO: 1) (see Example 3). 

[0123] The probe of the present invention may be included in a kit such as a diagnostic DNA probe kit or may be 
immobilized on a chip such as a DNA microarray chip. 

(B) Primers of the present invention 

[0124] In a preferred embodiment, the primers obtained from the nucleic acid for the canceration assay of the present 
invention are oligonucleotide primers. To prepare oligonucleotide primers, two regions may be selected from the ORF 
region of the nucleotide sequence shown in SEQ ID NO: 1 or 3 in such a manner as to satisfy the following conditions: 

a) the length of each region is at least several tens of nucleotides, particularly at least 15 nucleotides, preferably 
at least 17 nucleotides, more preferably at least 20 nucleotides, and at most 50 nucleotides; and 

b) the G+C content in each region is 40% to 70%. 
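
Conditions a) and b) lend themselves to a simple check. The following is a minimal sketch that uses the 15 to 50 nucleotide range and the 40% to 70% G+C range stated above as default parameters; the function names and example sequences are hypothetical.

```python
def gc_content(seq: str) -> float:
    """Return the G+C content of a sequence as a percentage."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def satisfies_primer_conditions(region: str, min_len: int = 15, max_len: int = 50,
                                gc_min: float = 40.0, gc_max: float = 70.0) -> bool:
    """Check condition a) (length) and condition b) (G+C content) for one region."""
    if not (min_len <= len(region) <= max_len):
        return False
    return gc_min <= gc_content(region) <= gc_max


# Toy regions for demonstration only (not from SEQ ID NO: 1 or 3).
balanced = "ATGCATGCATGCATGCATGC"   # 20 nt, 50% G+C
at_rich = "ATATATATATATATATATAT"    # 20 nt, 0% G+C
too_short = "GCGCGCGCGC"            # 10 nt
```

Stricter preferences (for example, a minimum of 17 or 20 nucleotides) can be expressed by passing a different `min_len`.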

[0125] In actual fact, oligonucleotide primers may be prepared as single-stranded DNAs having nucleotide sequenc- 
es identical or complementary to the two regions thus selected, or may be prepared as single-stranded DNAs modified 
not to lose the binding specificity to these nucleotide sequences. Although each primer of the present invention pref- 
erably has a sequence that is completely complementary to the selected target sequence, a mismatch of one or two 
nucleotides may be permitted. 

[0126] Examples of the pair of primers according to the present invention include a pair of oligonucleotides consisting
of SEQ ID NOs: 14 and 15 (corresponding to complementary strands against nucleotides 481-501 and 562-581 in
SEQ ID NO: 1, respectively) for human G34, and a pair of oligonucleotides consisting of SEQ ID NOs: 17 and 18
(corresponding to complementary strands against nucleotides 481-501 and 562-581 in SEQ ID NO: 3, respectively)
for mouse G34.
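
The nucleotide positions quoted for the human primer pair (481-501 and 562-581) and for the probe of SEQ ID NO: 16 (525-556) can be checked for consistency with the preference that the probe target a region between the primers. The short sketch below does this arithmetic; it assumes that the quoted positions are 1-based and inclusive and that the primers define the two ends of the amplified region.

```python
# Positions in SEQ ID NO: 1 as quoted in the text (assumed 1-based, inclusive).
forward_primer = (481, 501)
reverse_primer = (562, 581)
probe = (525, 556)


def span_length(region):
    """Length of an inclusive (start, end) region."""
    start, end = region
    return end - start + 1


# Expected length of the amplified region, from the first to the last primer position.
amplicon_length = reverse_primer[1] - forward_primer[0] + 1

# The probe should lie strictly between the two primer binding regions.
probe_is_internal = forward_primer[1] < probe[0] and probe[1] < reverse_primer[0]
```

Under these assumptions the amplified region is about 101 nucleotides long and the 32-nucleotide probe region falls entirely between the primers.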

(7) Canceration assay according to the present invention 

[0127] As described earlier, the G34 nucleic acid of the present invention was confirmed to show a significant increase
in the expression level (i.e., transcription level of the gene from the genome into mRNA) in a cancerous biological 
sample when compared to a normal biological sample. The G34 nucleic acid of the present invention was demonstrated 
to be useful at least in a canceration assay for large intestine (colon) cancer or lung cancer (see Example 3). 
[0128] According to detailed embodiments of the canceration assay of the present invention, transcription products 
extracted from a biological sample or a nucleic acid library derived therefrom may be used as a test sample and 
measured for the amount of the G34 nucleic acid (typically the amount of its mRNA) using the above probe or primer 
to determine whether the measured value is significantly higher than that of a normal biological sample. In this case, 
if the measured value of the test biological sample is significantly higher than the reference value of the normal biological 
sample, the test biological sample is determined as being cancerous or having a high grade of malignancy. 
[0129] In the canceration assay of the present invention, the reference value for a normal biological sample used as 
a control may be a value measured for a control site (typically a normal site) in the same tissue of the same patient or 
may be a value normalized from known data obtained in a control site, e.g., the mean value of mRNA levels in normal 
tissues. 

[0130] According to the measurement of expression levels using the nucleic acid for measurement of the present 
invention, human G34 is found to be expressed at a high level in the brain, skeletal muscle, pancreas, adrenal gland, 
testis and prostate when measured in normal sites, and there is also significant expression in other sites, although at 
a relatively low level. This indicates that human G34 expression is widely found over various tissues and that the 
expression level of human G34 is significantly increased even in tissues with a relatively low expression level, such 
as large intestine (colon) and lung tissues. Once these data have been provided, those skilled in the art will recognize 
the actual utility and effect of the nucleic acid for measurement of the present invention. 

[0131] In this assay, whether the measured value for a test sample is significantly higher than that of a normal sample
may be determined by the criteria that are set depending on the accuracy (positive rate) required for the assay or the 
grade of malignancy to be determined. The criteria may be freely set depending on the intended purpose; for example, 
the reference value to be determined as positive may be set to a higher value for the purpose of detecting tissues with
a high grade of malignancy or may be set to a lower value for the purpose of comprehensively detecting test samples
with signs or risk of canceration. 
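
The comparison of a measured value against a reference value can be sketched as follows. The fold cutoff and the numerical values are hypothetical and serve only to show how a freely chosen criterion enters the determination; the text does not prescribe any particular cutoff.

```python
from statistics import mean


def reference_from_normals(normal_values):
    """Reference value taken as the mean of measured values in normal tissues."""
    return mean(normal_values)


def call_sample(test_value: float, reference_value: float, fold_cutoff: float = 2.0) -> str:
    """Return 'positive' when the test value reaches fold_cutoff times the reference."""
    return "positive" if test_value >= fold_cutoff * reference_value else "negative"


# Hypothetical normalized expression values.
reference = reference_from_normals([2.0, 4.0, 6.0])   # mean of normal samples
strict_call = call_sample(10.0, reference, fold_cutoff=3.0)
lenient_call = call_sample(10.0, reference, fold_cutoff=2.0)
```

Raising `fold_cutoff` makes the call stricter, whereas lowering it flags more samples as positive.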

[0132] Examples will be given below of hybridization and PCR assays to illustrate the canceration assay of the 
present invention. 

(A) Hybridization assay 

[0133] Embodiments of this assay include those using a probe obtained from the nucleic acid of the present invention,
e.g., methods using various hybridization assays well known to those skilled in the art, exemplified by Southern blotting, 
Northern blotting, dot blotting or colony hybridization. In the case of requiring amplification and/or quantification of the 
detected signal, these methods may further be combined with immunoassay. 

[0134] According to typical hybridization assays, a nucleic acid extracted from a biological sample or an amplification 
product thereof may be immobilized on a solid phase and hybridized with a labeled probe under stringent conditions. 
After washing, the label attached to the solid phase may be measured. 

[0135] Extraction and purification of transcription products from a biological sample may be accomplished by using 
any method known to those skilled in the art. 

(B) PCR assay 

[0136] In a preferred embodiment, the canceration assay of the present invention includes PCR methods based on 
nucleic acid amplification using the primers of the present invention. The details of PCR are as explained earlier. In 
this subsection, a detailed PCR-based embodiment of this assay will be explained. 

[0137] G34 mRNA in transcription products to be assayed can be amplified by PCR using a pair of primers located
at both ends of a given region selected from the nucleotide sequence of G34. In this step, if even trace amounts of
G34 nucleic acid fragments are present in an analyte, these fragments will serve as templates to replicate and amplify
the nucleic acid region between the primer pair. After repeating a given number of PCR cycles, the nucleic acid
fragments serving as templates are each amplified to a desired concentration. Under the same amplification conditions,
the amplification product will be obtained in proportion to the amount of G34 mRNA present in the analyte. Then, the
above probe or the like targeting the amplified region may be used to confirm whether the amplification product is the
nucleic acid of interest and also quantify the same. Likewise, the nucleic acid in a normal tissue may also be measured
in the same manner. In this case, a nucleic acid of a gene that is widely and usually present in the same tissue or the
like (e.g., a nucleic acid encoding glyceraldehyde-3-phosphate dehydrogenase (GAPDH) or β-actin) may be used as
a control to remove variations among individuals. The measured value for the transcription level of G34 is provided for
comparison to assay the presence of canceration or the grade of malignancy, as described above.
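
One simple way to use a control gene such as GAPDH or β-actin is to divide the G34 signal by the control signal measured in the same sample before comparing tissues. The sketch below assumes that both signals are measured on the same linear scale; the function names and numbers are hypothetical and are not data from the Examples.

```python
def normalized_level(g34_signal: float, control_signal: float) -> float:
    """G34 signal relative to a control gene measured in the same sample."""
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return g34_signal / control_signal


def fold_change(test_level: float, normal_level: float) -> float:
    """Ratio of normalized levels between a test sample and a normal sample."""
    return test_level / normal_level


# Hypothetical signals (arbitrary units).
test_level = normalized_level(g34_signal=200.0, control_signal=100.0)    # 2.0
normal_level = normalized_level(g34_signal=50.0, control_signal=100.0)   # 0.5
ratio = fold_change(test_level, normal_level)
```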
[0138] A nucleic acid sample provided for PCR methods may be either total mRNA extracted from a biological sample
(e.g., a test tissue or cell) or total cDNA reverse transcribed from mRNA. In a case where mRNA is amplified, the
NASBA method (3SR method, TMA method) using the primer pair mentioned above may be employed. Since the
NASBA method per se is well known and kits for this method are commercially available, the method may be readily
accomplished by using the primer pair of the present invention.

[0139] To detect or quantify the above amplification product, the reaction solution after amplification may be
electrophoresed and the resulting bands may be stained with ethidium bromide or the like, or alternatively, the electrophoresed
amplification product may be immobilized onto a solid phase (e.g., a nylon membrane), hybridized with a labeled probe
specifically hybridizing to a test nucleic acid (e.g., a probe having the nucleotide sequence of SEQ ID NO: 16) and
washed, followed by detection of the label.

[0140] Examples of PCR methods preferred for this assay include quantitative PCR, especially kinetic RT-PCR or
quantitative real-time PCR. In particular, quantitative real-time RT-PCR targeted at mRNA libraries is preferred in view
that it allows direct purification of a target to be measured from a biological sample and directly reflects the transcription
level. However, the nucleic acid quantification in this assay is not limited to quantitative PCR. Other known quantitative
DNA assays (e.g., Northern blotting, dot blotting, DNA microarray) using the above probe may also be applied to the
PCR products.

[0141] Moreover, when performed using a quencher fluorescent dye and a reporter fluorescent dye, quantitative
RT-PCR also enables quantification of a target nucleic acid in an analyte. In particular, it may be readily performed
since kits for quantitative RT-PCR are commercially available. Moreover, a target nucleic acid may also be
semi-quantified based on the intensity of the corresponding electrophoretic band.

(C) Assay for therapeutic effect on cancer 

[0142] Other embodiments of the canceration assay of the present invention include an assay for determining the
effect of curing or alleviating cancer. For example, targets of this assay include all treatments such as administration 
of an anticancer agent and radiation therapy, and targets of these treatments include in vitro cancer cells or cancer 
tissues derived from cancer patients or experimental animal models for carcinogenesis. 

[0143] According to this assay, in a case where a biological sample is subjected to a certain treatment, it is possible 
to know the therapeutic effect of the treatment on cancer by determining whether the transcription level of the G34
nucleic acid in the biological sample is reduced due to the treatment. This assay is not limited to a determination 
whether the transcription level is reduced, and the result may also be evaluated as effective when an increase in the 
transcription level is significantly prevented. The transcription level may not only be compared with that of an untreated 
tissue, but also traced over time after the treatment. 
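
The two comparisons described here, treated versus untreated and tracing over time, can be sketched as follows. The function names and the values are hypothetical; the sketch only illustrates the bookkeeping and does not address statistical significance.

```python
def relative_change(untreated_level: float, treated_level: float) -> float:
    """Relative change in transcription level; negative values indicate a reduction."""
    return (treated_level - untreated_level) / untreated_level


def is_non_increasing(levels_over_time) -> bool:
    """True when the level never rises between consecutive time points."""
    return all(later <= earlier
               for earlier, later in zip(levels_over_time, levels_over_time[1:]))


# Hypothetical normalized G34 transcription levels.
change = relative_change(untreated_level=10.0, treated_level=4.0)   # a 60% reduction
time_course = [10.0, 7.5, 5.0, 4.0]    # measurements at successive time points
suppressed = is_non_increasing(time_course)
```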
[0144] The assay of the present invention for therapeutic effect on cancer includes, for example, a determination
whether a candidate substance for an anticancer agent is effective for cancerous tissues, whether resistance is
developed to an anticancer agent in cancer patients receiving the agent, or whether a candidate substance for an anticancer
agent is effective for diseased tissues or the like in experimental animal models. Test tissues from experimental animal
models are not limited to in vitro samples, and also include in vivo or ex vivo samples.


(8) Creation of genetically engineered animal 

[0145] As described earlier, the inventors of the present invention have identified the presence of mouse G34 and 
its nucleic acid sequence (SEQ ID NO: 3). The present invention also relates to a means for expression and functional 
analysis of G34 at the animal level on the basis of various gene conversion techniques using fertilized eggs or ES
cells, and typically relates to creating transgenic animals into which the G34 gene is introduced and knockout mice which
are deficient in mouse G34, etc. 

[0146] For example, the creation of knockout mice may be accomplished in accordance with routine techniques in
the art (see, e.g., Newest Technique for Gene Targeting, edited by Takeshi Yagi, Yodosha Co., Ltd., Japan; Gene
targeting, translated and edited by Tetsuo Noda, Medical Science International, Ltd., Japan). Namely, those skilled in
the art will be able to obtain G34 homologous recombinant ES cells in accordance with known gene targeting techniques
using sequence information of the mouse G34 nucleic acid disclosed herein, thus creating G34 knockout mice using
these cells (see Example 7).

[0147] Recently, a method has been developed to prevent gene expression by small interfering RNA (T.R.
Brummelkamp et al., Science, 296, 550-553 (2002)); it is also possible to create G34 knockout mice in accordance with
such a known method.

[0148] The provision of G34 knockout mice will be helpful in elucidating the involvement of the G34 gene in certain 
vital phenomena, i.e., information on redundancy of the gene, the relationship between deficiency of the gene and
phenotype at the animal level (including any type of abnormality affecting motor, mental and sensory functions), as 
well as functions of the gene during the animal life cycle including development, growth and ageing. More specifically, 
the knockout mice thus obtained may be used to detect a carrier of sugar chains synthesized by G34 and mG34 and 
to examine their relationship with physiological functions or diseases, etc. For example, glycoproteins and glycolipids 
may be extracted from each tissue derived from the knockout mice and compared with those of wild-type mice by
techniques such as proteomics (e.g., two-dimensional electrophoresis, two-dimensional thin-layer chromatography, 
mass spectrometry) to identify a carrier of the synthesized sugar chains. Moreover, the relationship with physiological 
functions or diseases may be estimated by comparing phenotypes (e.g., fetal formation, growth process, spontaneous 
behavior) between knockout mice and wild-type mice. 


Definitions of terms 



[0149] As used herein to describe the transcription level of a nucleic acid, the term "measured value" or "expression 
level" refers to the amount of the nucleic acid present in transcription products from a fixed amount of a biological 
sample, i.e., the concentration of the nucleic acid. Moreover, since the assay of the present invention relies on the 
comparison of such measured values, even when a nucleic acid is amplified, e.g., by PCR for the purpose of quanti- 
fication or even when signals from a probe label are amplified, these amplified values may also be provided for relative 
comparison. Thus, the "measured value for a nucleic acid" can also be understood as the amount of the nucleic acid 
after amplification or the signal level after amplification. 

[0150] As used herein, the term "target nucleic acid" or "the nucleic acid" encompasses all types of nucleic acids, 
regardless of whether in vivo or in vitro, including of course G34 mRNA, as well as those obtained using the mRNA as a template.
It should be noted that the term "nucleotide sequence" used herein also includes a complementary sequence thereof, 
unless otherwise specified. 

[0151] As used herein, the term "biological sample" refers to an organ, tissue or cell, as well as an experimental 
animal-derived organ, tissue, cell or the like, preferably refers to a tissue or cell. Examples of such a tissue include the 
brain, fetal brain, cerebellum, medulla oblongata, submandibular gland, thyroid gland, trachea, lung, heart, skeletal 
muscle, esophagus, duodenum, small intestine, large intestine (colon), rectum, colon, liver, fetal liver, pancreas, kidney, 
adrenal gland, thymus, bone marrow, spleen, testis, prostate, mammary gland, uterus and placenta, with the large 
intestine (colon) and lung being more preferred. 

[0152] As used herein, the term "measure", "measurement" or "assay" encompasses all of detection, amplification, 
quantification and semi-quantification. In particular, the assay according to the present invention relates to a canceration
assay for a biological sample, as described above, and hence can be applied to, e.g., cancer diagnosis and treatment 
in the medical field. The term "canceration assay" used herein includes an assay as to whether a biological sample 
has become cancerous, as well as an assay as to whether the grade of malignancy is high. The term "cancer" used herein
typically encompasses malignant tumors in general and also includes disease conditions caused by the malignant 
tumors. Thus, targets of the assay according to the present invention include, but are not necessarily limited to, neu- 
roblastoma, glioma, lung cancer, esophageal cancer, gastric cancer, pancreatic cancer, liver cancer, kidney cancer, 
duodenal cancer, small intestine cancer, large intestine (colon) cancer, rectal cancer, colon cancer and leukemia, with 
large intestine (colon) cancer and lung cancer being preferred. 

[0153] The present invention will now be illustrated in more detail by way of the following examples. 
[EXAMPLES] 



Example 1: Cloning and expression of human G34 gene, as well as purification of the expressed protein

[0154] β3 galactosyltransferase 6 (β3GalT6) was used as a query for a BLAST search to thereby find a nucleic acid
sequence with homology (SEQ ID NO: 1). The open reading frame (ORF) estimated from the nucleic acid sequence 
is composed of 1503 bp, i.e., 500 amino acids (SEQ ID NO: 2) when calculated as an amino acid sequence. The 
product encoded by these nucleic acid and amino acid sequences was designated human G34. 
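
The stated ORF length and protein length are consistent with each other if, as is assumed here, the 1503 bp figure includes the stop codon. The arithmetic can be sketched as:

```python
# Values quoted in the text.
orf_length_bp = 1503
stated_amino_acids = 500

# Each codon spans three nucleotides.
codons, remainder = divmod(orf_length_bp, 3)

# Assumption: the ORF length counts the terminal stop codon, which encodes no residue.
amino_acids = codons - 1
```

Under this assumption, 1503 bp corresponds to 501 codons, i.e., 500 amino acids plus one stop codon.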
[0155] The amino acid sequence of G34 has a hydrophobic amino acid region characteristic of glycosyltransferases 
at its N-terminal end and shares a homology of 47% (nucleic acid sequence) and 28% (amino acid sequence) with the
above β3GalT6. The amino acid sequence of G34 also retains all of the three motifs conserved in the β3GalT family.
[0156] In this example, G34 was not only confirmed for its expression in mammalian cells, but also allowed to be
expressed in insect cells for further examination of its activity.

[0157] For activity confirmation, it would be sufficient to express at least an active region covering amino acid 189
to the C-terminal end of SEQ ID NO: 1, which is relatively homologous to β3GalT6. In this example, however, expression
of an active region covering amino acid 36 to the C-terminal end was attempted.


Confirmation of human G34 gene expression in mammalian cells 

[0158] The active region covering amino acid 36 to the C-terminal end of G34 was genetically introduced into a
mammalian cell line expression vector pFLAG-CMV3 using a FLAG Protein Expression system (Sigma-Aldrich
Corporation). Since pFLAG-CMV3 has a multicloning site, a gene of interest can be introduced into pFLAG-CMV3 when
the gene and pFLAG-CMV3 are treated with restriction enzymes and then subjected to ligation reaction.
[0159] Kidney-derived cDNA (Clontech, Marathon-ready cDNA) was used as a template and subjected to PCR using
a 5'-primer (G34-CMV-F1; SEQ ID NO: 5) and a 3'-primer (G34-CMV-R1; SEQ ID NO: 6) to obtain a DNA fragment of
interest. PCR was performed under conditions of 25 cycles of 98°C for 10 seconds, 55°C for 30 seconds, and 72°C
for 2 minutes. The PCR product was then electrophoresed on an agarose gel and isolated in a standard manner after
gel excision. This PCR product has restriction enzyme sites HindIII and BamHI at the 5' and 3' sides, respectively.
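
For planning purposes, the cycling conditions stated above can be written down as data and the total hold time estimated. The sketch below counts only the three stated hold steps per cycle; ramp times and any initial denaturation or final extension step are not stated in the text and are therefore left out.

```python
# (temperature in °C, duration in seconds) for one cycle, as stated in the text.
CYCLE_STEPS = [(98, 10), (55, 30), (72, 120)]
CYCLES = 25

# Time spent at the stated hold temperatures in one cycle.
seconds_per_cycle = sum(duration for _, duration in CYCLE_STEPS)

# Total hold time over all cycles (ramp times excluded).
total_hold_seconds = CYCLES * seconds_per_cycle
total_hold_minutes = total_hold_seconds / 60
```

This gives 160 seconds per cycle and 4000 seconds (roughly 67 minutes) of hold time in total.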
[0160] After this DNA fragment and pFLAG-CMV3 were each treated with restriction enzymes HindIII and BamHI,
the reaction solutions were mixed together and subjected to ligation reaction, so that the DNA fragment was introduced
into pFLAG-CMV3. The reaction solution was purified by ethanol precipitation and then mixed with competent cells
(E. coli DH5α). After heat shock treatment (42°C, 30 seconds), the cells were seeded on ampicillin-containing LB agar
medium.

[0161] On the next day, the resulting colonies were confirmed by direct PCR for the DNA of interest. For more reliable
results, after sequencing to confirm the DNA sequence, the vector (pFLAG-CMV3-G34A) was extracted and purified.
[0162] Cells of the human kidney-derived cell line 293T (2 x 10^6) were suspended in 10 ml antibiotic-free DMEM
medium (Invitrogen Corporation) supplemented with 10% fetal bovine serum, seeded in a 10 cm dish and cultured for
16 hours at 37°C in a CO2 incubator. pFLAG-CMV3-G34A (20 ng) and Lipofectamine 2000 (30 µl, Invitrogen Corporation)
were each mixed with 1.5 ml OPTI-MEM (Invitrogen Corporation) and incubated at room temperature for 5 minutes.
These two solutions were further mixed gently and incubated at room temperature for 20 minutes. This mixed solution
was added dropwise to the dish, and the cells were cultured for 48 hours at 37°C in a CO2 incubator.

[0163] The supernatant (10 ml) was mixed with NaN3 (0.05%), NaCl (150 mM), CaCl2 (2 mM) and anti-FLAG-M1
resin (100 µl, SIGMA), followed by overnight stirring at 4°C. On the next day, the supernatant was centrifuged (3000
rpm, 5 minutes, 4°C) to collect a pellet fraction. After addition of 2 mM CaCl2-TBS (900 µl), centrifugation was repeated
(2000 rpm, 5 minutes, 4°C) and the resulting pellet was suspended in 200 µl of 1 mM CaCl2-TBS for use as a sample
for activity measurement (G34 enzyme solution). A part of this sample was electrophoresed by SDS-PAGE and Western
blotted using anti-FLAG M2-peroxidase (SIGMA) to confirm the expression of the G34 protein of interest.

[0164] As a result, a band was detected at a position of about 60 kDa, thus confirming the expression of the G34 
protein. 

Insertion of human G34 gene into insect cell expression vector 


[0165] The active region covering amino acid 36 to the C-terminal end of G34 was integrated into pFastBac (Invit- 
rogen Corporation) in a GATEWAY system (Invitrogen Corporation). Moreover, a Bac-to-Bac system (Invitrogen Cor- 
poration) was also used to construct a bacmid. 

(1) Creation of entry clone

[0166] Kidney-derived cDNA (Clontech, Marathon-ready cDNA) was used as a template and subjected to PCR using
a 5'-primer (G34-GW-F1; SEQ ID NO: 7) and a 3'-primer (G34-GW-R1; SEQ ID NO: 8) to obtain a DNA fragment of
interest. PCR was performed under conditions of 25 cycles of 98°C for 10 seconds, 55°C for 30 seconds, and 72°C
for 2 minutes. The PCR product was then electrophoresed on an agarose gel and isolated in a standard manner after
gel excision.

[0167] This product was integrated into pDONR201 (Invitrogen Corporation) through BP clonase reaction to create
an "entry clone." The reaction was accomplished by incubating the DNA fragment of interest (5 µl), pDONR201 (1 µl,
150 ng), reaction buffer (2 µl) and BP clonase mix (2 µl) at 25°C for 1 hour. The reaction was stopped by addition of
proteinase K (1 µl) and incubation at 37°C for 10 minutes. The above reaction solution (1 µl) was then mixed with 100
µl competent cells (E. coli DH5α, TOYOBO). After heat shock treatment, the cells were seeded on a
kanamycin-containing LB plate.

[0168] On the next day, colonies were collected and confirmed by direct PCR for the DNA of interest. For more
reliable results, after sequencing to confirm the DNA sequence, the vector (pDONR-G34A) was extracted and purified.

(2) Creation of expression clone 

[0169] At both sides of the insertion site, the above entry clone has attL recombination sites for excision of lambda 
phage from E. coli. When the entry clone is mixed with LR clonase (a mixture of lambda phage recombination enzymes
Int, IHF and Xis) and a destination vector, the insertion site is transferred to the destination vector to give an expression 
clone. Detailed steps are as shown below. 

[0170] First, the entry clone (1 µl), pFBIF (0.5 µl, 75 ng), LR reaction buffer (2 µl), TE (4.5 µl) and LR clonase mix
(2 µl) were reacted at 25°C for 1 hour. The reaction was stopped by addition of proteinase K (1 µl) and incubation at 37°C
for 10 minutes (this recombination reaction results in pFBIF-G34A). pFBIF is a pFastBac1 vector modified to have an
Igκ signal sequence (SEQ ID NO: 9) and a FLAG peptide for purification (SEQ ID NO: 10). The Igκ signal sequence
is inserted for the purpose of converting the expressed protein into a secretion form, while the FLAG peptide is inserted
for the purpose of purification. To insert the FLAG peptide, a DNA fragment obtained from OT3 (SEQ ID NO: 11) as a
template using primers OT20 (SEQ ID NO: 12) and OT21 (SEQ ID NO: 13) was inserted with BamHI and EcoRI.
Further, to insert a Gateway sequence, a Gateway Vector Conversion system (Invitrogen Corporation) was used to
introduce a Conversion cassette.

[0171] Subsequently, the whole volume of the above mixed solution (11 µl) was mixed with 100 µl competent cells
(E. coli DH5α). After heat shock treatment, the cells were seeded on an ampicillin-containing LB plate. On the next day,
colonies were collected and confirmed by direct PCR for the DNA of interest, and the vector (pFBIF-G34A) was
extracted and purified.

(3) Construction of bacmid by Bac-to-Bac system 

[0172] Next, a Bac-to-Bac system (Invitrogen Corporation) was used to cause recombination between the above 
pFBIF- and pFastBac, so that G34 and other sequences were inserted into a bacmid capable of growing in insect cells. 
[0173] This system utilizes a Tn7 recombination site and allows a gene of interest to be incorporated into a bacmid 
through a recombinant protein produced from a helper plasmid when pFastBac carrying the inserted gene of interest 
is merely introduced into bacmid-containing E. coli (DH10BAC, Invitrogen Corporation). In addition, such a bacmid 
contains the lacZ gene and allows selection based on the classical blue (not inserted)/white (inserted) colony screening. 
[0174] Namely, the vector purified above (pFBIF-G34A) was mixed with 50 µl competent cells (E. coli DH10BAC).
After heat shock treatment, the cells were seeded in an LB plate containing kanamycin, gentamicin, tetracycline,
Bluo-gal and IPTG. On the next day, white single colonies were further cultured to collect the bacmid.

Introduction of human G34 gene-containing bacmid into insect cells 

[0175] After confirming that the sequence of interest was inserted into the bacmid obtained from the above white
colonies, this bacmid was introduced into insect cells (Sf21, commercially available from Invitrogen Corporation).
[0176] Namely, Sf21 cells were added to a 35 mm dish at 9 × 10^5 cells/2 ml antibiotic-containing Sf-900SFM
(Invitrogen Corporation) and cultured at 27°C for 1 hour to allow cell adhesion. (Solution A) Purified bacmid DNA (5 µl)
diluted with 100 µl antibiotic-free Sf-900SFM. (Solution B) CellFECTIN Reagent (6 µl, Invitrogen Corporation) diluted
with 100 µl antibiotic-free Sf-900SFM. Solutions A and B were then mixed carefully and incubated for 45 minutes at
room temperature. After confirming cell adhesion, the culture solution was aspirated and replaced by antibiotic-free
Sf-900SFM (2 ml). The solution prepared by mixing Solutions A and B (lipid-DNA complexes) was diluted and mixed
carefully with antibiotic-free Sf900II (800 µl). The culture solution was aspirated from the cells and replaced by the
diluted solution of lipid-DNA complexes, followed by incubation at 27°C for 5 hours. The transfection mixture was then
removed and replaced by antibiotic-containing Sf-900SFM culture solution (2 ml), followed by incubation at 27°C for
72 hours. At 72 hours after transfection, the cells were released by pipetting and collected together with the culture
solution, followed by centrifugation at 3000 rpm for 10 minutes. The resulting supernatant was stored in another tube
(which was used as a first virus solution).

[0177] Sf21 cells were introduced into a T75 culture flask at 1 × 10^7 cells/20 ml Sf-900SFM (antibiotic-containing)
and incubated at 27°C for 1 hour. After the cells had adhered, the first virus solution (800 µl) was added and cultured
at 27°C for 48 hours. After 48 hours, the cells were released by pipetting and collected together with the culture solution,
followed by centrifugation at 3000 rpm for 10 minutes. The resulting supernatant was stored in another tube (which
was used as a second virus solution).

[0178] Moreover, Sf21 cells were introduced into a T75 culture flask at 1 × 10^7 cells/20 ml Sf-900SFM
(antibiotic-containing) and incubated at 27°C for 1 hour. After the cells had adhered, the second virus solution (100 µl) was
added and cultured at 27°C for 72 hours. After culturing, the cells were released by pipetting and collected together
with the culture solution, followed by centrifugation at 3000 rpm for 10 minutes. The resulting supernatant was stored
in another tube (which was used as a third virus solution). In addition, Sf21 cells were introduced into a 100 ml spinner
flask at a concentration of 6 × 10^5 cells/ml in a volume of 100 ml. The third virus solution (1 ml) was added and cultured
at 27°C for about 96 hours. After culturing, the cells and the culture solution were collected and centrifuged at 3000
rpm for 10 minutes. The resulting supernatant was stored in another tube (which was used as a fourth virus solution).

Resin purification of G34 

[0179] The pFLAG-G34 supernatant of the above fourth virus solution (10 ml) was mixed with NaN3 (0.05%), NaCl
(150 mM), CaCl2 (2 mM) and anti-FLAG-M1 resin (100 µl, SIGMA), followed by overnight stirring at 4°C. On the next
day, the mixture was centrifuged (3000 rpm, 5 minutes, 4°C) to collect a pellet fraction. After addition of 2 mM CaCl2-TBS
(900 µl), centrifugation was repeated (2000 rpm, 5 minutes, 4°C) and the resulting pellet was suspended in 200 µl of
1 mM CaCl2-TBS for use as a sample for activity measurement (G34 enzyme solution). A part of this sample was
electrophoresed by SDS-PAGE and Western blotted using anti-FLAG M2-peroxidase (SIGMA) to confirm the
expression of the G34 protein of interest. As a result, a plurality of bands were detected broadly around a position of about
60 kDa (which would be due to differences in post-translational modifications such as glycosylation), thus confirming
the expression of the G34 protein.

Example 2: Search for glycosyltransferase activity of human G34 protein
(1) Screening of GalNAc transferase activity 

[0180] The G34 protein was examined for its substrate specificity, optimum buffer, optimum pH and divalent ion
requirement in its β1,3-N-acetylgalactosaminyltransferase activity.
[0181] The following reaction system was used for examining the G34 enzyme protein for its acceptor substrate
specificity in its GalNAc transfer activity.

[0182] In the reaction solutions shown below, each of the following was used at 10 nmol as an acceptor substrate:
pNp-α-Gal, oNp-β-Gal, Bz-α-GlcNAc, pNp-β-GlcNAc, Bz-α-GalNAc, pNp-β-GalNAc, pNp-α-Glc, pNp-β-Glc,
pNp-β-GlcA, pNp-α-Fuc, pNp-α-Xyl, pNp-β-Xyl and pNp-α-Man (all purchased from SIGMA), wherein "Gal" represents a
D-galactose residue, "Xyl" represents a D-xylose residue, "Fuc" represents a D-fucose residue, "Man" represents a
D-mannose residue and "GlcA" represents a glucuronic acid residue.

[0183] Each reaction solution was prepared as follows (final concentrations in parentheses): each substrate (10
nmol), MES (2-morpholinoethanesulfonic acid) (pH 6.5, 50 mM), MnCl2 (10 mM), Triton X-100 (trade name) (0.1%),
UDP-GalNAc (2 mM) and UDP-[14C]GlcNAc (40 nCi) were mixed and supplemented with 5 µl G34 enzyme solution,
followed by dilution with H2O to a total volume of 20 µl (see Table 1).



Table 1

Composition of reaction solutions (µl)

                        E(+), D(+)    ×8      E(-), D(+)    E(+), D(-)
Enzyme solution         5             40      0             5
140 mM HEPES pH 7.4     2             16      2             2
100 mM UDP-GalNAc       0.5           4       0.5           0
200 mM MnCl2            1             8       1             1
10% Triton CF-54        0.6           4.8     0.6           0.6
H2O                     5.9           47.2    10.9          6.4
10 nmol/µl Acceptor     5             40      5             5
Total                   20                    20            20
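
The final concentrations implied by Table 1 follow from the usual dilution rule (stock concentration × volume added / total volume). The sketch below applies that rule to the E(+), D(+) column; the variable and function names are illustrative only. Under this rule the 100 mM UDP-GalNAc stock at 0.5 µl in 20 µl works out to 2.5 mM, which is close to, but not identical with, the 2 mM quoted in the text.

```python
TOTAL_UL = 20.0

# Stock concentration, unit and volume added (µl), E(+), D(+) column of Table 1.
STOCKS = {
    "HEPES pH 7.4": (140.0, "mM", 2.0),
    "UDP-GalNAc": (100.0, "mM", 0.5),
    "MnCl2": (200.0, "mM", 1.0),
    "Triton CF-54": (10.0, "%", 0.6),
}

# Every volume (µl) in the same column, used to confirm the stated total.
COLUMN_VOLUMES_UL = [5.0, 2.0, 0.5, 1.0, 0.6, 5.9, 5.0]


def final_concentration(stock, volume_ul, total_ul=TOTAL_UL):
    """Dilution rule: C_final = C_stock * V_added / V_total."""
    return stock * volume_ul / total_ul


print(f"column total: {sum(COLUMN_VOLUMES_UL):g} µl")
for name, (stock, unit, volume_ul) in STOCKS.items():
    print(f"{name}: {final_concentration(stock, volume_ul):g} {unit}")
```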



[0184] The above reaction mixtures were each reacted at 37°C for 16 hours. After completion of the reaction, 200
µl H2O was added and each mixture was lightly centrifuged to obtain the supernatant. The supernatant was passed
through a Sep-Pak plus C18 Cartridge (Waters), which had been washed once with 1 ml methanol and twice with 1
ml H2O and then equilibrated, to allow the substrate and product in the supernatant to adsorb to the cartridge. After
washing the cartridge twice with 1 ml H2O, the adsorbed substrate and product were eluted with 1 ml methanol. The
eluate was mixed with 5 ml liquid scintillator ACSII (Amersham Biosciences) and measured for the amount of radiation
with a scintillation counter (Beckman Coulter).

[0185] As a result, the G34 protein was identified as a GalNAc transferase having the ability to transfer GalNAc to
pNp-β-GlcNAc. The enzymatic activity increased linearly at least over the course of the reaction time between 0
and 16 hours when UDP-GlcNAc was used as a donor substrate and Bz-β-GlcNAc was used as an acceptor substrate
(see Table 2 and Figure 1).



Table 2

Reaction time    Area (%)
1 hour           0
2 hours          2.388
4 hours          6.195
16 hours         13.719
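
One simple way to inspect how closely the time course in Table 2 follows a straight line is to compare the average rate of product formation between consecutive time points. The helper below is a sketch for that purpose only; the function name is illustrative, and the four data points are those of Table 2.

```python
# Reaction time (hours) and product peak area (%) from Table 2.
TIME_COURSE = [(1, 0.0), (2, 2.388), (4, 6.195), (16, 13.719)]


def interval_rates(points):
    """Average rate (area % per hour) between consecutive time points."""
    return [
        (t1, t2, (a2 - a1) / (t2 - t1))
        for (t1, a1), (t2, a2) in zip(points, points[1:])
    ]


for start, end, rate in interval_rates(TIME_COURSE):
    print(f"{start}-{end} h: {rate:.3f} % per hour")
```

Equal rates across the intervals would correspond to a strictly linear increase; the product continues to accumulate up to 16 hours in any case.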



Determination of linking mode 

[0186] NMR was performed to analyze the linking mode of the sugar chain structure synthesized by the G34 enzyme
protein.

[0187] First, the reaction solution (final concentrations in parentheses) was prepared by adding Bz-β-GlcNAc (640
nmol) as an acceptor substrate, HEPES buffer (pH 7.4, 14 mM), Triton CF-54 (trade name) (0.3%), UDP-GalNAc (2
mM), MnCl2 (10 mM) and 500 µl G34 enzyme solution, followed by dilution with H2O to a total volume of 2 ml. This
reaction solution was reacted at 37°C for 16 hours. The reaction solution was heated for 5 minutes at 95°C to stop the
reaction and then purified by filtration through an Ultrafree-MC (Millipore Corporation).

[0188] In one development, 50 µl of the filtrate was analyzed by high performance liquid chromatography (HPLC)
using a reversed-phase column ODS-80Ts QA (4.6 x 250 mm, Tosoh Corporation, Japan). The developing solvent
used was an aqueous 9% acetonitrile-0.1% trifluoroacetic acid solution. The elution conditions were set to 1 ml/minute
at 40°C. Absorbance at 210 nm was used as an index for elution peak detection using an SPD-10Avp (Shimadzu
Corporation, Japan). As a result, a new elution peak was observed, which was not detected in the control. This peak
was separated and lyophilized for use as an NMR sample.

[0189] NMR was performed using a DMX750 (Bruker Daltonics). As a result, the sample was determined as having
a β1-3 linkage between GalNAc and GlcNAc-β1-O-Bz (see Figures 2A and 2B). The reasons for this determination are
as follows (see Figures 2A and 2B, along with Figures 3 and 4): a) two residues (referred to as A and B) both have a
spin coupling constant of 8.4 Hz for the signal at position 1, suggesting that the two pyranoses are in β-form; b) the spin
coupling constants given in Figure 3 indicate that A shows a spin coupling constant characteristic of glucose, while B
shows a spin coupling constant characteristic of galactose; c) it is A that is linked to the benzyl because NOE was
observed between the methylene proton of the benzyl and the A1 proton; d) there are two signals resulting from the methyl of
N-acetyl and hence both residues are identified as N-acetylated sugars; and e) NOESY indicates the presence of NOE
in B1-A3.

[0190] On the other hand, examination was also performed on motif sequences involved in the above enzymatic 
activity. 

[0191] Figure 5 shows the putative amino acid sequence of the G34 protein (SEQ ID NO: 2) compared with the
amino acid sequences of various human β1-3Gal transferases (β3Gal-T1 to -T6). In Figure 5, the boxed regions
indicate the motifs common to Gal transferases. Among them, three motifs indicated with M1 to M3 are common to
β1,3-linking glycosyltransferases. In this figure, the amino acid residues indicated with * are conserved among the
compared sequences.

[0192] Figure 6 shows a comparison of three motifs involved in the ability to form β1,3 linkages (corresponding to
the M1 to M3 motifs in Figure 5) among various β1-3GlcNAc transferases (β3Gn-T2 to -T5) and human Gal transferases
T1 to T3, T5 and T6. In this figure, the amino acid residues indicated with * are conserved among the compared
sequences.

[0193] As shown in Figures 5 and 6, it was indicated that the amino acid sequence of the G34 protein was conserved
enough to have all the motifs (M1 to M3) involved in β1,3 linkages, upon comparison with the amino acid sequences
of various known β1,3-linking glycosyltransferases.
[0194] Thus, this motif examination also supported the conclusion that the G34 protein has the ability to transfer
GalNAc to GlcNAc with β1,3 glycosidic linkage.

Optimum buffer and optimum pH

[0195] The following reaction system was used for examining the optimum buffer and pH for the GalNAc transferase
activity of G34. The acceptor substrate used was pNp-β-GlcNAc.

[0196] Any one of the following buffers was used (final concentrations in parentheses): MES
(2-morpholinoethanesulfonic acid) buffer (pH 5.5, 5.78, 6.0, 6.5 and 6.75, 50 mM), sodium cacodylate buffer (pH 5.0, 5.6, 6.0, 6.2, 6.6, 6.8,
7.0, 7.2, 7.4 and 7.5, 25 mM) and N-[2-hydroxyethyl]piperazine-N'-[2-ethanesulfonic acid] (HEPES) buffer (pH 6.75,
7.00, 7.30, 7.40 and 7.50, 14 mM). The substrate (10 nmol), MnCl2 (10 mM), Triton CF-54 (trade name) (0.3%),
UDP-GalNAc (2 mM) and UDP-[14C]GlcNAc (40 nCi) were mixed and supplemented with 5 µl G34 enzyme solution, followed
by dilution with H2O to a total volume of 20 µl.

[0197] The above reaction mixtures were each reacted at 37°C for 16 hours. After completion of the reaction, 200
µl H2O was added and each mixture was lightly centrifuged to obtain the supernatant. The supernatant was passed
through a Sep-Pak plus C18 Cartridge (Waters), which had been washed once with 1 ml methanol and twice with 1
ml H2O and then equilibrated, to allow the substrate and product in the supernatant to adsorb to the cartridge. After
washing the cartridge twice with 1 ml H2O, the adsorbed substrate and product were eluted with 1 ml methanol. The
eluate was mixed with 5 ml liquid scintillator ACSII (Amersham Biosciences) and measured for the amount of radiation
with a scintillation counter (Beckman Coulter).

[0198] As indicated by the results (see Table 3 and Figure 7), in MES buffer, G34 showed the same strong activity
around pH 5.50 and pH 5.78 within the examined range and its activity decreased in a pH-dependent manner until pH
6.5, but became strong again at pH 6.75. In sodium cacodylate buffer, the activity was highest at pH 5.0 within the
examined range and the activity decreased in a pH-dependent manner until pH 6.2, increased in a pH-dependent
manner until pH 7.0, and then plateaued until pH 7.4. In HEPES buffer, the activity increased in a pH-dependent manner
and reached the highest value at pH 7.4 to 7.5 within the examined range. Among them, HEPES buffer at pH 7.4 to
7.5 resulted in the strongest activity.
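
The pH optimum per buffer can be read off the counts of Table 3 by subtracting the "-" count from the "+" count and taking the largest net value. The sketch below does this on the table values; it assumes that the last column of Table 3 is exactly this difference (which the numbers are consistent with), and all names are illustrative.

```python
# (pH, "+" count, "-" count) per buffer, transcribed from Table 3.
TABLE3 = {
    "Sodium cacodylate": [
        (5.0, 6042, 204), (5.6, 3353, 159), (6.0, 2689, 260),
        (6.2, 907, 138), (6.6, 1093, 136), (6.8, 2488, 258),
        (7.0, 4965, 259), (7.2, 4377, 309), (7.4, 4930, 304),
    ],
    "MES": [
        (5.50, 3735, 197), (5.78, 3755, 184), (6.00, 2514, 141),
        (6.50, 1981, 734), (6.75, 3289, 136),
    ],
    "HEPES": [
        (6.75, 4894, 149), (7.00, 4912, 121), (7.30, 4294, 127),
        (7.40, 6630, 120), (7.50, 6895, 240),
    ],
}


def net_counts(rows):
    """Background-subtracted counts: (pH, plus - minus) for each row."""
    return [(ph, plus - minus) for ph, plus, minus in rows]


def best_ph(rows):
    """Return the (pH, net count) pair with the highest net count."""
    return max(net_counts(rows), key=lambda pair: pair[1])


for buffer_name, rows in TABLE3.items():
    ph, net = best_ph(rows)
    print(f"{buffer_name}: highest net count {net} at pH {ph}")
```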

Table 3

Buffer               pH      +       -      (+) - (-)
Sodium cacodylate    5.0     6042    204    5838
                     5.6     3353    159    3194
                     6.0     2689    260    2429
                     6.2     907     138    769
                     6.6     1093    136    957
                     6.8     2488    258    2230
                     7.0     4965    259    4706
                     7.2     4377    309    4068
                     7.4     4930    304    4626
MES                  5.50    3735    197    3538
                     5.78    3755    184    3571
                     6.00    2514    141    2373
                     6.50    1981    734    1247
                     6.75    3289    136    3153
HEPES                6.75    4894    149    4745
                     7.00    4912    121    4791
                     7.30    4294    127    4167
                     7.40    6630    120    6510
                     7.50    6895    240    6655

[0199] The following reaction system was used for examining the divalent ion requirement. The acceptor substrate
used was Bz-β-GlcNAc.

[0200] The reaction solution (final concentrations in parentheses) was prepared by adding the substrate (10 nmol),
HEPES buffer (pH 7.4, 14 mM), Triton CF-54 (trade name) (0.3%), UDP-GalNAc (2 mM), UDP-[14C]GlcNAc (40 nCi)
and 5 µl G34 enzyme solution and further adding MnCl2, MgCl2 or CoCl2 at 2.5 mM, 5 mM, 10 mM, 20 mM or 40 mM,
followed by dilution with H2O to a total volume of 20 µl.

[0201] The above reaction mixture was reacted at 37°C for 16 hours. After completion of the reaction, 200 µl H2O
was added and the mixture was lightly centrifuged to obtain the supernatant. The supernatant was passed through a
Sep-Pak plus C18 Cartridge (Waters), which had been washed once with 1 ml methanol and twice with 1 ml H2O and
then equilibrated, to allow the substrate and product in the supernatant to adsorb to the cartridge. After washing the
cartridge twice with 1 ml H2O, the adsorbed substrate and product were eluted with 1 ml methanol. The eluate was
mixed with 5 ml liquid scintillator ACSII (Amersham Biosciences) and measured for the amount of radiation with a
scintillation counter (Beckman Coulter).

[0202] The results (see Table 4 and Figure 8) indicated that the activity was enhanced by the addition of each divalent
ion and confirmed that the G34 protein was an enzyme requiring divalent ions. Its activity nearly plateaued at 5 mM or
higher concentration of Mn or Co and at 10 mM or higher concentration of Mg. Moreover, the Mn-induced enhancement
of the activity was completely eliminated by addition of Cu.



Table 4

RI assay (divalent ion requirement)

Metal ion    Concentration (mM)    DPM
Mn           2.5                   7260.09
             5                     8270.23
             10                    7748.77
             20                    7515.86
             40                    4870.48
             40                    371.53
Co           2.5                   10979.99
             5                     9503.91
             10                    10979.99
             20                    8070.47
             40                    7854.92
Mg           2.5                   4800.03
             5                     8692.15
             10                    8980.56
             20                    6726.32
             40                    5592.88
none                               2427.39
EDTA         20                    149.32
Mn+Cu        10+10                 239
none                               155.64



Substrate specificity to oligosaccharides 

[0203] The following reaction system was used for examining the acceptor substrate specificity to oligosaccharides.
The acceptor substrates used were pNp-α-Gal, oNp-β-Gal, Bz-α-GlcNAc, Bz-β-GlcNAc, Bz-α-GalNAc, pNp-β-GalNAc,
pNp-α-Glc, pNp-β-Glc, pNp-β-GlcA, pNp-α-Fuc, pNp-α-Xyl, pNp-β-Xyl, pNp-α-Man, lactoside-Bz, Lac-ceramide,
Gal-ceramide, paragloboside, globoside, Gal-β1-4 GalNAc-α-pNp, Gal-β1-3 GlcNAc-β-pNp, GlcNAc-β1-4 GlcNAc-β-Bz,
pNp-core1 (Gal-β1-3 GalNAc-α-pNp), pNp-core2 (Gal-β1-3 (GlcNAc-β1-6) GalNAc-α-pNp), pNp-core3 (GlcNAc-β1-3
GalNAc-α-pNp) and pNp-core6 (GlcNAc-β1-6 GalNAc-α-pNp). "Lac" represents a D-lactose residue.
[0204] Each reaction solution (final concentrations in parentheses) was prepared by adding each substrate (50 nmol),
HEPES buffer (pH 7.4, 14 mM), Triton CF-54 (trade name) (0.3%), UDP-GalNAc (2 mM), MnCl2 (10 mM),
UDP-[3H]GlcNAc and 5 µl G34 enzyme solution, followed by dilution with H2O to a total volume of 20 µl.

[0205] The above reaction mixtures were each reacted at 37°C for 2 hours. After completion of the reaction, 200 uJ 
H 2 0 was added and each mixture was lightly centrifuged to obtain the supernatant. The supernatant was passed 
through a Sep-Pak plus C18 Cartridge (Waters), which had been washed once with 1 ml methanol and twice with 1 
ml H 2 0 and then equilibrated, to allow the substrate and product in the supernatant to adsorb to the cartridge. After 
washing the cartridge twice with 1 ml H 2 0, the adsorbed substrate and product were eluted with 1 ml methanol. The 
eluate was mixed with 5 ml liquid scintillator ACSII (Amersham Biosciences) and measured for the amount of radiation
with a scintillation counter (Beckman Coulter). 

[0206] The results thus measured were compared assuming that the radioactivity obtained using Bz-β-GlcNAc as a
substrate was set to 100% (see Table 5). When used as a substrate, pNp-core2 showed the largest increase in
radioactivity. Bz-β-GlcNAc, GlcNAc-β1-4-GlcNAc-β-Bz, pNp-core6 and pNp-core3 also showed increases in radioactivity
in the order named. The other substrates showed no increase in radioactivity.



Table 5

No.    Acceptor substrate         %
1      pNp-α-Gal                  N.D.
2      oNp-β-Gal                  N.D.
3      Bz-α-GlcNAc                N.D.
4      Bz-β-GlcNAc                100
5      Bz-α-GalNAc                N.D.
6      pNp-β-GalNAc               N.D.
7      pNp-α-Glc                  N.D.
8      pNp-β-Glc                  N.D.
9      pNp-β-GlcA                 N.D.
10     pNp-α-Fuc                  N.D.
11     pNp-α-Xyl                  N.D.
12     pNp-β-Xyl                  N.D.
13     pNp-α-Man                  N.D.
14     Lactoside-Bz               N.D.
15     Lac-ceramide               N.D.
16     Gal-ceramide               N.D.
17     Paragloboside              N.D.
18     Globoside                  N.D.
19     Galβ1-4GalNAc-α-pNp        N.D.
20     Galβ1-3GlcNAc-β-pNp        N.D.
21     GlcNAcβ1-4GlcNAc-β-Bz      29
22     core1-pNp                  N.D.
23     core2-pNp                  185
24     core3-pNp                  8
25     core6-pNp                  [illegible]

N.D.: Not determined due to no radioactivity
core1: Gal-β1-3-GalNAc-α-pNp
core2: Gal-β1-3-(GlcNAc-β1-6)GalNAc-α-pNp
core3: GlcNAc-β1-3-GalNAc-α-pNp
core6: GlcNAc-β1-6-GalNAc-α-pNp



(2) Confirmation of activity by HPLC analysis 

[0207] Using uridine diphosphate-N-acetylgalactosamine (UDP-GalNAc; Sigma-Aldrich Corporation) as a sugar
residue donor substrate and Bz-β-GlcNAc as a sugar residue acceptor substrate, the enzymatic activity of G34 was
analyzed by high performance liquid chromatography (HPLC).

[0208] The reaction solution (final concentrations in parentheses) was prepared by adding Bz-β-GlcNAc (10 nmol),
HEPES buffer (pH 7.4, 14 mM), Triton CF-54 (trade name) (0.3%), UDP-GalNAc (2 mM), MnCl2 (10 mM) and 10 µl
G34 enzyme solution, followed by dilution with H2O to a total volume of 20 µl. This reaction solution was reacted at
37°C for 16 hours. The reaction was stopped by addition of H2O (100 µl) and the reaction solution was purified by
filtration through an Ultrafree-MC (Millipore Corporation).

[0209] The filtrate (10 µl) was analyzed by high performance liquid chromatography (HPLC) using a reversed-phase
column ODS-80Ts QA (4.6 x 250 mm, Tosoh Corporation, Japan). The developing solvent used was an aqueous 9%
acetonitrile-0.1% trifluoroacetic acid solution. The elution conditions were set to 1 ml/minute at 40°C. Absorbance at
210 nm was used as an index for elution peak detection using an SPD-10Avp (Shimadzu Corporation, Japan).
[0210] As a result, a new elution peak was observed, which was not detected in the control.

(3) Analysis of reaction product by mass spectrometry 

[0211] The above peak was collected and the reaction product was analyzed by mass spectrometry. Matrix-assisted
laser desorption ionization-time of flight mass spectrometry (MALDI-TOF-MS) was performed using a Reflex IV
(Bruker Daltonics). The sample at 10 pmol was dried and dissolved in 1 µl distilled water for use as a MALDI-TOF-MS
sample.

[0212] As a result, a peak at 538.194 m/z was observed. This peak corresponded to the molecular weight of
GalNAc-GlcNAc-Bz (sodium salt).

[0213] This result also indicated that the G34 enzyme protein transfers GalNAc to Bz-β-GlcNAc.
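
As a rough cross-check of this assignment, the expected m/z of a singly charged sodium adduct can be computed from the elemental composition. The sketch below assumes that Bz-β-GlcNAc is benzyl 2-acetamido-2-deoxy-β-D-glucopyranoside (C15H21NO6), that one HexNAc residue (C8H13NO5) is added, and that the product ionizes as [M+Na]+; none of these formulas is stated in the text. Under these assumptions the monoisotopic value comes out about one m/z unit below the reported peak, and the text does not say which isotopic peak or calibration was used.

```python
# Monoisotopic atomic masses (u).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
SODIUM = 22.9897693
ELECTRON = 0.00054858

# Assumed compositions (not given in the specification).
ACCEPTOR = {"C": 15, "H": 21, "N": 1, "O": 6}        # benzyl β-GlcNAc
HEXNAC_RESIDUE = {"C": 8, "H": 13, "N": 1, "O": 5}   # GalNAc minus water


def monoisotopic_mass(formula):
    """Sum of monoisotopic atomic masses for an element-count dictionary."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def sodium_adduct_mz(formula):
    """m/z of the singly charged [M+Na]+ ion."""
    return monoisotopic_mass(formula) + SODIUM - ELECTRON


PRODUCT = {element: ACCEPTOR[element] + HEXNAC_RESIDUE[element] for element in ACCEPTOR}
print(PRODUCT, f"[M+Na]+ = {sodium_adduct_mz(PRODUCT):.3f}")
```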
Example 3: Measurement for mRNA expression level of human G34 
(1) Expression levels in various human normal tissues 

[0214] Quantitative real-time PCR was used for comparing the mRNA expression levels of G34 in human normal 
tissues. Quantitative real-time PCR is a PCR method using a sense primer and an antisense primer in combination 
with a fluorescently-labeled probe. When a gene is amplified by PCR, a fluorescent label of the probe will be released 
to produce fluorescence. The fluorescence intensity is amplified in correlation with gene amplification and thus used 
as an index for quantification. 

[0215] RNA of each human normal tissue (Clontech) was extracted with an RNeasy Mini Kit (QIAGEN) and converted
into single strand DNA by the oligo(dT) method using a Super-Script First-Strand Synthesis System (Invitrogen
Corporation). This DNA was used as a template and subjected to quantitative real-time PCR in an ABI PRISM 7700 (Applied
Biosystems Japan Ltd.) using a 5'-primer (SEQ ID NO: 14), a 3'-primer (SEQ ID NO: 15) and a TaqMan probe (SEQ
ID NO: 16). PCR was performed under conditions of 50°C for 2 minutes and 95°C for 10 minutes, and then under
conditions of 50 cycles of 95°C for 15 seconds and 60°C for 1 minute. To prepare a calibration curve, plasmid DNA
obtained by introducing a partial sequence of G34 into pFLAG-CMV3 (Invitrogen Corporation) was used as a template
and subjected to PCR as described above.
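
A calibration curve of this kind is commonly used by fitting the threshold cycle (Ct) of a plasmid dilution series against log10 of the copy number and then inverting the fit for unknown samples. The following is a minimal sketch of that general approach, not the procedure of the instrument software; the dilution series and Ct values are hypothetical, and the function names are illustrative.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the copy number of a sample."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical plasmid dilution series (copies per reaction) and measured Ct.
STANDARD_COPIES = [1e3, 1e4, 1e5, 1e6]
STANDARD_CT = [30.0, 26.7, 23.4, 20.1]

slope, intercept = fit_standard_curve(STANDARD_COPIES, STANDARD_CT)
print(f"slope {slope:.2f}, intercept {intercept:.2f}")
print(f"sample with Ct 25.0: about {copies_from_ct(25.0, slope, intercept):.0f} copies")
```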

[0216] The results confirmed that high-level expression was observed specifically in the testis, followed by skeletal
muscle and prostate in the order named (Table 6).

Table 6

G34 mRNA expression levels in human normal tissues

Tissue                 Copy number (×10000/µg, total RNA)    Standard error
Brain                  [illegible]                           [illegible]
Fetal brain            [illegible]                           [illegible]
[illegible]            [illegible]                           [illegible]
Medulla oblongata      [illegible]                           [illegible]
Submandibular gland    [illegible]                           [illegible]
Thyroid gland          [illegible]                           [illegible]
Trachea                [illegible]                           [illegible]
Lung                   [illegible]                           [illegible]
[illegible]            [illegible]                           [illegible]
[illegible]            [illegible]                           1.1
Small intestine        [illegible]                           [illegible]
[illegible]            [illegible]                           [illegible]
Liver                  [illegible]                           0.1
Fetal liver            [illegible]                           0.3
Pancreas               [illegible]                           [illegible]
Kidney                 [illegible]                           0.3
Adrenal gland          [illegible]                           1.3
Thymus                 4.0                                   0.2
Bone marrow            3.1                                   [illegible]
Spleen                 4.2                                   0.3
Testis                 115.5                                 2.0
Prostate               14.6                                  1.5
Mammary gland          5.2                                   0.2
Uterus                 5.0                                   0.2
Placenta               1.4                                   0.4



(2) Expression levels in human cancer cell lines 



[0217] Quantitative real-time PCR as mentioned above was used for comparing the mRNA expression levels of G34
in various cancer-derived human cell lines. After cells of each human cell line were collected, RNA was extracted with
an RNeasy Mini Kit (QIAGEN) and converted into single strand DNA by the oligo(dT) method using a Super-Script
First-Strand Synthesis System (Invitrogen Corporation). This DNA was used as a template and subjected to quantitative
real-time PCR in an ABI PRISM 7700 (Applied Biosystems Japan Ltd.) using a 5'-primer (SEQ ID NO: 14), a 3'-primer
(SEQ ID NO: 15) and a TaqMan probe (SEQ ID NO: 16). PCR was performed under conditions of 50°C for 2 minutes
and 95°C for 10 minutes, and then under conditions of 50 cycles of 95°C for 15 seconds and 60°C for 1 minute.
[0218] As a result, the expression was observed in all the human cell lines (Table 7, Figure 9).

Table 7

G34 mRNA expression levels in human cell lines

Category                         Cell line     Copy number (×10^4/µg, total RNA)    Standard error
Neuroblastoma                    SCCH-26       7.9                                  0.6
                                 NAGAI         19.5                                 1.5
                                 NB-9          40.6                                 2.3
                                 SK-N-SH       14.9                                 0.7
                                 SK-N-MC       5.8                                  0.5
                                 NB-1          20.9                                 0.5
                                 IMR32         21.0                                 0.2
Glioma                           T98G          6.2                                  0.2
                                 KG-1          3.9                                  0.0
                                 A172          13.4                                 0.9
                                 GI-1          13.7                                 1.3
                                 U118MG        6.8                                  0.5
                                 U251          28.9                                 1.9
                                 KG-1-C        9.1                                  0.6
Lung cancer                      Lu130         6.8                                  0.4
                                 Lu134A        30.3                                 1.2
                                 Lu134B        6.8                                  0.4
                                 Lu135         7.2                                  1.3
                                 Lu139         10.7                                 0.5
                                 Lu140         15.4                                 1.8
                                 SBC-1         2.5                                  0.2
                                 PC-7          9.1                                  0.2
                                 PC-9          22.4                                 0.1
                                 HAL-8         15.2                                 1.2
                                 KAL-24        20.8                                 1.7
                                 ABC-1         10.3                                 0.9
                                 RERF-LC-AC    22.8                                 2.2
                                 EHHA-9        20.3                                 7.9
                                 PC-1          2.1                                  0.2
                                 EBC-1         4.4                                  0.2
                                 PC-10         118.8                                4.9
                                 A549          27.1                                 2.6
                                 LX-1          30.7                                 2.1
Esophageal cancer                ES1           23.0                                 2.5
                                 ES2           16.1                                 0.6
                                 ES6           42.8                                 3.0
Gastric cancer                   MKN1          6.2                                  1.1
                                 MKN28         8.6                                  1.0
                                 MKN7          9.7                                  0.1
                                 MKN74         3.5                                  0.8
                                 MKN-45        7.3                                  2.1
                                 HSC-43        42.8                                 1.7
                                 KATOIII       6.4                                  0.4
                                 TMK-1         10.8                                 1.2
Large intestine (colon) cancer   LSC           11.8                                 0.6
                                 LSB           4.9                                  0.3
                                 SW480         10.1                                 0.4
                                 SW1116        24.1                                 1.4
                                 Colo201       10.4                                 0.4
                                 Colo205       6.8                                  0.9
                                 C1            21.9                                 1.2
                                 WiDr          1.2                                  0.0
                                 HCT8          82.2                                 6.2
                                 HCT15         12.1                                 1.0
Others                           A204          67.9                                 4.4
                                 A-431         30.6                                 2.5
                                 SW1736        11.9                                 1.1
                                 HepG2         2.3                                  0.3
                                 Capan-2       19.4                                 1.2
                                 293T          55.1                                 8.3
                                 PA-1          3.5                                  0.6
Leukemia                         HL-60         2.1                                  0.1
                                 K-562         17.1                                 1.8
Lymphoma                         Daudi         2.4                                  0.2
                                 Namalwa       13.0                                 1.2
                                 CHM-IB        16.4                                 0.4
                                 Ramos         9.5                                  0.7
                                 Raji          11.6                                 1.3
                                 Jurkat        42.7                                 1.9

(3) Expression levels in cancerous tissues

[0219] Quantitative real-time PCR as mentioned above was used for comparing the mRNA expression levels of G34
in cancer tissues and their surrounding normal tissues derived from patients with large intestine (colon) cancer and
lung cancer.

[0220] From cancer and normal tissues of the same patient, RNA was extracted with an RNeasy Mini Kit (QIAGEN)
and converted into single strand DNA by the oligo(dT) method using a Super-Script First-Strand Synthesis System
(Invitrogen Corporation). This DNA was used as a template and subjected to quantitative real-time PCR in an ABI
PRISM 7700 (Applied Biosystems Japan Ltd.) using a 5'-primer (SEQ ID NO: 14), a 3'-primer (SEQ ID NO: 15) and a
TaqMan probe (SEQ ID NO: 16). PCR was performed under conditions of 50°C for 2 minutes and 95°C for
10 minutes, and then under conditions of 50 cycles of 95°C for 15 seconds and 60°C for 1 minute. To correct variations
among individuals, the resulting data were divided by the value of β-actin (internal standard gene) quantified using a kit
of Applied Biosystems Japan before being compared.

[0221] The results indicated that the mRNA expression level of the G34 gene was significantly increased in these
cancerous tissues (Table 8, Table 9).



Table 8 





G34 mRNA expression levels in tissues from large intestine cancer patients 


20 


Patient No. 


Normal tissue 


Standard error 


Cancer tissue 


Standard error 


%Change 




1 


0.15 


0.04 


0.35 


0.07 


2.3 




2 


0.15 


0.07 


8.63 


0.65 


58.0 


25 


3 


0.07 


0.02 


1.55 


0.15 


23.5 


4 


0.08 


0.05 


1.82 


0.26 


22.0 




5 


0.08 


0.02 


0.60 


0.07 


7.2 




6 


1.04 


0.08 


1.92 


0.21 


1.8 


30 


7 


0.07 


0.02 


5.37 


1.06 


81.3 




8 


1.54 


0.27 


8.30 


0.96 


5.4 




9 


0.05 


0.04 


1.70 


0.37 


34.3 


35 


10 


0.05 


0.04 


0.10 


0.04 


2.0 


11 


0.60 


0.29 


10.23 


1.47 


17.2 




12 


0.17 


0.13 


2.36 


0.43 


14.3 




13 


0.18 


0.09 


1.70 


0.27 


9.4 


40 


14 


0.18 


0.08 


2.76 


0.23 


15.2 




15 


0.18 


0.05 


3.49 


0.34 


19.2 




16 


0.20 


0.15 


1.84 


0.25 


9.3 


45 


17 


0.28 


0.05 


7.41 


0.51 


26.4 


18 


0.05 


0.04 


5.92 


0.38 


119.3 




19 


0.15 


0.11 


4.68 


0.67 


31.4 




20 


0.13 


0.06 


4.61 


2.22 


34.9 


50 


21 


0.02 


0.02 


8.40 


1.65 


508.0 




22 


0.20 


0.07 


3.57 


0.43 


18.0 




23 


0.55 


0.27 


2.33 


1.23 


4.3 


55 


Average 


0.25 


0.07 


3.97 


0.55 


15.6 


Copy number (xl0000/uxj, total RNA) 
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Table 9 



10 



30 



G34 mRNA expression levels in tissues from lung cancer patients

Patient No.   Normal tissue   Standard error   Cancer tissue   Standard error   %Change
1             0.48            0.06             2.03            0.27             4.2
3             0.00            0.00             0.55            0.21             -
4             2.43            0.40             6.13            0.17             2.5
5             0.10            0.04             2.74            0.32             27.7
6             1.69            0.28             3.11            0.69             1.8
7             0.60            0.16             2.76            0.35             4.6
8             2.30            0.38             6.23            0.21             2.7
9             1.26            0.27             2.51            0.10             2.0
10            1.47            0.18             4.76            0.57             3.2
11            0.64            0.00             1.14            0.11             1.8
12            0.56            0.06             0.69            0.04             1.2
13            1.32            0.02             1.98            0.15             1.5
14            0.17            0.02             0.66            0.02             4.0
15            0.71            0.05             2.71            0.13             3.8
16            1.07            0.13             15.64           1.11             14.6
17            1.03            0.12             8.27            0.73             8.1
18            0.13            0.02             1.95            0.09             14.8
Average       0.94            0.71             3.76            3.64             4.0

Copy number (×10000/µg total RNA)
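
As an illustration of how the figures in Tables 8 and 9 relate to one another, the following minimal Python sketch recomputes the cancer/normal ratio from the β-actin-normalized copy numbers. It assumes that the values tabulated under "%Change" correspond to this ratio, which is consistent with the listed rows but is not stated explicitly in the text. The function name fold_change is hypothetical. Because the tabulated copy numbers are rounded, a recomputed ratio may differ slightly from the tabulated one.

```python
from typing import Optional


def fold_change(normal: float, cancer: float) -> Optional[float]:
    """Return the cancer/normal expression ratio, or None if normal is zero."""
    if normal == 0:
        return None  # ratio is undefined; Table 9 shows "-" in this case
    return cancer / normal


# (normal, cancer) copy numbers copied from the tables
rows = {
    "Table 8, patient 1": (0.15, 0.35),
    "Table 8, patient 6": (1.04, 1.92),
    "Table 9, patient 3": (0.00, 0.55),
    "Table 9, patient 4": (2.43, 6.13),
}

for label, (normal, cancer) in rows.items():
    ratio = fold_change(normal, cancer)
    print(label, "-" if ratio is None else f"{ratio:.1f}")
```

Returning None for a zero denominator mirrors the "-" entry given for patient 3 in Table 9, where no expression was detected in the normal tissue.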



Example 4: Cloning and expression of mouse G34 gene 


[0222] The human G34 sequence obtained in Example 1 was used as a query for a search against the mouse gene
sequence database Celera (Applied Biosystems) to thereby find a corresponding nucleic acid sequence with high homology. The
open reading frame (ORF) estimated from this nucleic acid sequence is composed of 1515 bp (SEQ ID NO: 3), i.e.,
504 amino acids (SEQ ID NO: 4) when calculated as an amino acid sequence, and has a hydrophobic amino acid
region characteristic of glycosyltransferases at its N-terminal end. This sequence shares a homology of 86% (nucleic
acid sequence) and 88% (amino acid sequence) with human G34 (SEQ ID NOs: 1 and 2) (see Figure 10). Moreover,
the sequence retains all of the three motifs conserved in the β3GalT family. The product encoded by the nucleic acid
sequence of SEQ ID NO: 3 and having the amino acid sequence of SEQ ID NO: 4 was designated mouse G34 (mG34).
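
The relationship between the ORF length and the number of encoded amino acids stated above can be checked with a short Python sketch. It assumes that the stated ORF length includes the terminal stop codon; the function name encoded_amino_acids is hypothetical and is used only for illustration.

```python
def encoded_amino_acids(orf_bp: int, includes_stop_codon: bool = True) -> int:
    """Number of amino acids encoded by an ORF of the given length in bp."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    # The stop codon occupies one codon but encodes no amino acid.
    return codons - 1 if includes_stop_codon else codons


print(encoded_amino_acids(1515))  # mouse G34 ORF (SEQ ID NO: 3)
print(encoded_amino_acids(1503))  # human G34 ORF (SEQ ID NO: 1)
```

Under this assumption the 1515 bp mouse ORF gives 504 residues and the 1503 bp human ORF gives 500 residues, in agreement with the lengths given for SEQ ID NOs: 4 and 2 in the sequence listing.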
[0223] To examine the activity of mG34, it was expressed in a mammalian cell line. In this example,
the active region covering amino acid 35 to the C-terminal end of mG34 was genetically introduced into a mammalian
cell line expression vector, pFLAG-CMV3, using a FLAG Protein Expression system (Sigma-Aldrich Corporation).
[0224] The expression in mouse tissues was confirmed by PCR. Each mouse tissue (brain, thymus, stomach, small
intestine, large intestine (colon), liver, pancreas, spleen, kidney, testis or skeletal muscle) was used as a template and
subjected to PCR using a 5'-primer (mG34-CMV-F1; SEQ ID NO: 17) and a 3'-primer (mG34-CMV-R1; SEQ ID NO:
18). PCR was performed under conditions of 25 cycles of 98°C for 10 seconds, 55°C for 30 seconds, and 72°C for 2
minutes. The PCR product was electrophoresed on an agarose gel to confirm a band of approximately 1500 bp. As a
result, as shown in Table 10, the expression level was highest in the testis, followed by spleen and skeletal muscle in
the order named.

Table 10



mG34 mRNA expression levels in mouse tissues

Tissue                    Expression level
Brain                     ±
Thymus
Stomach                   +
Small intestine
Large intestine (colon)
Liver                     +
Pancreas
Spleen
Kidney                    ++
Testis                    +++
Skeletal muscle           ++



[0225] Mouse testis-derived cDNA was used as a template and subjected to PCR using a 5'-primer (mG34-CMV-F1;
SEQ ID NO: 17) and a 3'-primer (mG34-CMV-R1; SEQ ID NO: 18) to obtain a DNA fragment of interest. PCR was
performed under conditions of 25 cycles of 98°C for 10 seconds, 55°C for 30 seconds, and 72°C for 2 minutes. The
PCR product was then electrophoresed on an agarose gel and isolated in a standard manner after gel excision. This
PCR product has restriction enzyme sites HindIII and NotI at the 5' and 3' sides, respectively.
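
The presence of the two restriction sites in the primers can be illustrated with a small Python sketch that searches the primer sequences listed as SEQ ID NOs: 17 and 18 for the recognition sequences of HindIII (AAGCTT) and NotI (GCGGCCGC). The recognition sequences are general knowledge rather than something stated in this document, and the helper name find_sites is hypothetical.

```python
# Recognition sequences (both palindromic, so strand orientation does not matter)
SITES = {"HindIII": "aagctt", "NotI": "gcggccgc"}

# Primer sequences as given in the sequence listing
PRIMERS = {
    "mG34-CMV-F1": "cccaagcttgggagcgcggcagatcaatcagccttat",
    "mG34-CMV-R1": "ttttccttttgcggccgcttttttcctttcatcgtacttttgcttcacactga",
}


def find_sites(sequence: str) -> dict:
    """Map each enzyme whose site occurs in the sequence to its 0-based position."""
    seq = sequence.lower()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


for primer_name, primer_seq in PRIMERS.items():
    print(primer_name, len(primer_seq), find_sites(primer_seq))
```

In this sketch the 5'-primer carries only the HindIII site and the 3'-primer only the NotI site, which matches the orientation described for the PCR product.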

[0226] After this DNA fragment and pFLAG-CMV3 were each treated with restriction enzymes HindIII and NotI, the
reaction solutions were mixed together and subjected to ligation reaction, so that the DNA fragment was introduced
into pFLAG-CMV3. The reaction solution was purified by ethanol precipitation and then mixed with competent cells
(E. coli DH5α). After heat shock treatment (42°C, 30 seconds), the cells were seeded on ampicillin-containing LB agar
medium.

[0227] On the next day, the resulting colonies were confirmed by direct PCR for the DNA of interest. For more reliable 
results, after sequencing to confirm the DNA sequence, the vector (pFLAG-CMV3-mG34A) was extracted and purified. 

[0228] Human kidney cell-derived cell line 293T cells (2 × 10^6) were suspended in 10 ml antibiotic-free DMEM
medium (Invitrogen Corporation) supplemented with 10% fetal bovine serum, seeded in a 10 cm dish and cultured for
16 hours at 37°C in a CO2 incubator. pFLAG-CMV3-mG34A (20 ng) and Lipofectamine 2000 (30 µl, Invitrogen
Corporation) were each mixed with 1.5 ml OPTI-MEM (Invitrogen Corporation) and incubated at room temperature for 5
minutes. These two solutions were further mixed gently and incubated at room temperature for 20 minutes. This mixed
solution was added dropwise to the dish, and the cells were cultured for 48 hours at 37°C in a CO2 incubator.

[0229] The supernatant (10 ml) was mixed with NaN3 (0.05%), NaCl (150 mM), CaCl2 (2 mM) and anti-M1 resin (100
µl, SIGMA), followed by overnight stirring at 4°C. On the next day, the supernatant was centrifuged (3000 rpm, 5
minutes, 4°C) to collect a pellet fraction. After addition of 2 mM CaCl2-TBS (900 µl), centrifugation was repeated (2000
rpm, 5 minutes, 4°C) and the resulting pellet was suspended in 200 µl of 1 mM CaCl2-TBS for use as a sample for
activity measurement (mouse G34 enzyme solution). A part of this sample was electrophoresed by SDS-PAGE and
Western blotted using anti-FLAG M2-peroxidase (SIGMA) to confirm the expression of the mG34 protein of interest.
As a result, a band was detected at a position of about 60 kDa, thus confirming the expression of the mG34 protein.

Example 5: Search for glycosyltransferase activity of mouse G34

[0230] The following reaction system was used for examining mouse G34 for its substrate specificity in its β1,3-N-
acetylgalactosamine transferase activity. In the reaction solutions shown below, each of the following was used at 10
nmol as an "acceptor substrate": pNp-α-Gal, oNp-β-Gal, Bz-α-GlcNAc, Bz-β-GlcNAc, Bz-α-GalNAc, pNp-β-GalNAc,
pNp-α-Glc, pNp-β-Glc, pNp-β-GlcA, pNp-α-Fuc, pNp-α-Xyl, pNp-β-Xyl, pNp-α-Man, lactoside-Bz, Lac-ceramide,
Gal-ceramide, Gb3, globoside, Gal-β1-4GalNAc-α-pNp, Galβ1-3GlcNAc-β-Bz, GlcNAc-β1-4-GlcNAc-β-Bz, core1-pNp,
core2-pNp, core3-pNp and core6-pNp (all purchased from SIGMA).

[0231] Each reaction solution was prepared as follows (final concentrations in parentheses): each substrate (10
nmol), HEPES (N-[2-hydroxyethyl]piperazine-N'-[2-ethanesulfonic acid]) (pH 7.4, 14 mM), MnCl2 (10 mM), Triton
CF-54 (trade name) (0.3%), UDP-GalNAc (2 mM) and UDP-[14C]GlcNAc (40 nCi) were mixed and supplemented with 5
µl mouse G34 enzyme solution, followed by dilution with H2O to a total volume of 20 µl.
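
To make the quantities in the reaction mixture easier to compare, the following Python sketch converts between amounts and concentrations for the 20 µl total volume given above. It is only a unit-conversion illustration; the function names are hypothetical, and it assumes that the concentrations in parentheses are final concentrations in the 20 µl mixture, as the paragraph states.

```python
TOTAL_VOLUME_UL = 20.0  # total reaction volume in microliters


def amount_nmol(conc_mM: float, volume_ul: float = TOTAL_VOLUME_UL) -> float:
    """Amount in nmol: (mmol/l) x (µl) = nmol."""
    return conc_mM * volume_ul


def concentration_mM(nmol: float, volume_ul: float = TOTAL_VOLUME_UL) -> float:
    """Concentration in mM for a given amount in nmol."""
    return nmol / volume_ul


print("acceptor substrate (mM):", concentration_mM(10.0))
print("UDP-GalNAc (nmol):", amount_nmol(2.0))
print("MnCl2 (nmol):", amount_nmol(10.0))
```

On these figures the 10 nmol of acceptor substrate corresponds to 0.5 mM, while 2 mM UDP-GalNAc corresponds to 40 nmol, i.e., a fourfold molar excess of donor over acceptor.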

[0232] The above reaction mixtures were each reacted at 37°C for 16 hours. After completion of the reaction, 200
µl H2O was added and each mixture was lightly centrifuged to obtain the supernatant. The supernatant was passed
through a Sep-Pak plus C18 Cartridge (Waters), which had been washed once with 1 ml methanol and twice with 1
ml H2O and then equilibrated, to allow the substrate and product in the supernatant to adsorb to the cartridge. After
washing the cartridge twice with 1 ml H2O, the adsorbed substrate and product were eluted with 1 ml methanol. The
eluate was mixed with 5 ml liquid scintillator ACSII (Amersham Biosciences) and measured for the amount of radiation
with a scintillation counter (Beckman Coulter).

[0233] The results thus measured were compared assuming that the radioactivity obtained using Bz-β-GlcNAc as a
substrate was set to 100% (Table 11). When used as a substrate, Bz-β-GlcNAc showed the largest increase in
radioactivity. core2-pNp, core6-pNp, core3-pNp, pNp-β-Glc and GlcNAc-β1-4-GlcNAc-β-Bz also showed high radioactivity
in the order named. The other substrates showed no increase in radioactivity.



Table 11

Acceptor substrate        %
pNp-α-Gal                 ND
oNp-β-Gal                 ND
Bz-α-GlcNAc               ND
Bz-β-GlcNAc               100
Bz-α-GalNAc               ND
pNp-β-GalNAc              ND
pNp-α-Glc                 ND
pNp-β-Glc                 12
pNp-β-GlcA                ND
pNp-α-Fuc                 ND
pNp-α-Xyl                 ND
pNp-β-Xyl                 ND
pNp-α-Man                 ND
Lactoside-Bz              ND
Lac-ceramide              ND
Gal-ceramide              ND
Gb3                       ND
Globoside                 ND
Galβ1-4GalNAc-α-pNp       ND
Galβ1-3GlcNAc-β-pNp       ND
GlcNAcβ1-4GlcNAc-β-Bz     10
core1-pNp                 ND
core2-pNp                 25
core3-pNp                 14
core6-pNp                 18
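
The ordering of substrates described for Table 11 can be reproduced with a short Python sketch that sorts the acceptor substrates with detectable activity by their relative radioactivity. Only the percentages tabulated above are used; the ASCII substrate labels (with "alpha" and "beta" spelled out), the use of None for ND and the function name rank_substrates are choices made for this illustration. One ND entry is included to show how such entries are skipped.

```python
# Relative radioactivity (%) from Table 11; None stands for "ND"
relative_activity = {
    "Bz-beta-GlcNAc": 100,
    "pNp-beta-Glc": 12,
    "GlcNAc-beta1-4-GlcNAc-beta-Bz": 10,
    "core1-pNp": None,
    "core2-pNp": 25,
    "core3-pNp": 14,
    "core6-pNp": 18,
}


def rank_substrates(activity: dict) -> list:
    """Return (substrate, percent) pairs with detectable activity, highest first."""
    detected = [(name, pct) for name, pct in activity.items() if pct is not None]
    return sorted(detected, key=lambda item: item[1], reverse=True)


for name, pct in rank_substrates(relative_activity):
    print(f"{name}: {pct}%")
```

The resulting order (Bz-β-GlcNAc, core2-pNp, core6-pNp, core3-pNp, pNp-β-Glc, GlcNAc-β1-4-GlcNAc-β-Bz) agrees with the order named in the text.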



Example 6: In situ hybridization on mouse testis 

[0234] In situ hybridization using mG34 was performed on a mouse testis-derived sample to confirm the expression
of mG34 in the mouse testis sample (see Figure 11).

Example 7: Creation of G34 knockout mouse

[0235] A targeting vector (pBSK-mG34-KOneo) is constructed by inserting into pBluescript II SK(-) (TOYOBO) a
chromosomal fragment (about 10 kb) composed primarily of the exons (i.e., Exons 3 to 12 (1242 bp) within the ORF
region of mG34) that contain the activation domains of the gene to be knocked out (mG34). pBSK-mG34-KOneo is also
designed so that the drug resistance gene neo (neomycin resistance gene) is introduced into Exons 7 to 9, which are
putative GalNAc transferase active regions of mG34. As a result, Exons 7 to 9 of mG34 are deleted and replaced by
neo. The pBSK-mG34-KOneo thus obtained is linearized with the restriction enzyme NotI, and 80 µg of the linearized
vector is then transfected (e.g., by electroporation) into ES cells (derived from E14/129Sv mice) to select
G418-resistant colonies. The G418-resistant colonies are transferred to 24-well plates and then cultured.
After a part of the cells are frozen and stored, DNA is extracted from the remaining ES cells and around 120 colonies
of recombinant clones are selected by PCR. Further, Southern blotting or other techniques are performed to confirm
whether recombination has occurred as expected, finally selecting around 10 recombinant clones. ES cells from two of
the selected clones are injected into C57BL/6 mouse blastocysts. The mouse embryos injected with the ES cells are
transplanted into the uteri of recipient mice to generate chimeric mice, followed by germline transmission to obtain
heterozygous knockout mice.

SEQUENCE LISTING

<110> National Institute of Advanced Industrial Science and Technology
      Fujirebio Incorporated

<120> β1,3-N-Acetyl-D-galactosaminyltransferase protein and nucleic
      acid encoding the same, as well as canceration assay using
      the same

<130> PC/S-84-6

<160> 27

<210> 1
<211> 1503
<212> DNA
<213> Homo sapiens

<400> 1

atgcgaaact ggctggtgct gctgtgcccg tgtgtgctcg gggccgcgct gcacctctgg 60
ctgcggctgc gctccccgcc gcccgcctgc gcctccgggg ccggccctgc agatcagttg 120
gccttatttc ctcagtggaa atctactcac tatgatgtgg tagttggcgt gttgtcagct 180
cgcaataacc atgaacttcg aaacgtgata agaagcacct ggatgagaca tttgctacag 240
catcccacat taagtcaacg tgtgcttgtg aagttcataa taggtgctca tggctgtgaa 300
gtgcctgtgg aagacaggga agatccttat tcctgtaaac tactcaacat cacaaatcca 360
gttttgaatc aggaaattga agcgttcagt ctgtccgaag acacttcatc ggggctgcct 420
gaggatcgag ttgtcagcgt gagtttccga gttctctacc ccatcgttat taccagtctt 480
ggagtgttct acgatgccaa tgatgtgggt ttccagagga acatcactgt caaactttat 540
caggcagaac aagaggaggc cctcttcatt gctcgcttca gtcctccaag ctgtggtgtg 600
caggtgaaca agctgtggta caagcccgtg gaacaattca tcttaccaga gagctttgaa 660
ggtacaatcg tgtgggagag ccaagacctc cacggccttg tgtcaagaaa tctccacaaa 720
gtgacagtga atgatggagg gggagttctc agagtcatta cagctgggga gggtgcattg 780
cctcatgaat tcttggaagg tgtggaggga gttgcaggtg gttttatata tactattcag 840
gaaggtgatg ctctcttaca caaccttcat tctcgccctc aaagacttat tgatcatata 900
aggaatctcc atgaggaaga [illegible] aaggaggaaa gcagcatcta [illegible] 960
gtttttgtgg atgttgtcga cacttatcgt aatgttcctg caaaattatt [illegible] 1020
agatggactg tggaaacaac [illegible] ttgttgctga agacagatga [illegible] 1080
atagacctcg aagctgtatt [illegible] gtccaaaaga atctggatgg [illegible] 1140
tggtggggaa atttcagact gaattgggca gttgaccgaa ccggaaagtg gcaggagttg 1200
gagtacccga gccccgctta ccctgccttt gcatgtgggt caggatatgt gatctccaag 1260
gacatcgtca agtggctggc aagcaactcg gggaggttaa agacctatca gggtgaagat 1320
gtaagcatgg gcatctggat ggctgccata ggacctaaaa gataccagga cagtctgtgg 1380
ctgtgtgaga agacctgtga gacaggaatg ctgtcttctc ctcagtattc tccgtgggaa 1440
ctgacggaac tgtggaaact gaaggaacgg tgcggtgatc cttgtcgatg tcaagcaaga 1500
taa 1503

<210> 2
<211> 500
<212> PRT
<213> Homo sapiens

<400> 2

Met Arg Asn Trp Leu Val Leu Leu Cys Pro Cys Val Leu Gly Ala Ala
1 5 10 15

Leu His Leu Trp Leu Arg Leu Arg Ser Pro Pro Pro Ala Cys Ala Ser
20 25 30

Gly Ala Gly Pro Ala Asp Gln Leu Ala Leu Phe Pro Gln Trp Lys Ser
35 40 45

Thr His Tyr Asp Val Val Val Gly Val Leu Ser Ala Arg Asn Asn His
50 55 60

Glu Leu Arg Asn Val Ile Arg Ser Thr Trp Met Arg His Leu Leu Gln
65 70 75 80

His Pro Thr Leu Ser Gln Arg Val Leu Val Lys Phe Ile Ile Gly Ala
85 90 95

His Gly Cys Glu Val Pro Val Glu Asp Arg Glu Asp Pro Tyr Ser Cys
100 105 110

Lys Leu Leu Asn Ile Thr Asn Pro Val Leu Asn Gln Glu Ile Glu Ala
115 120 125

Phe Ser Leu Ser Glu Asp Thr Ser Ser Gly Leu Pro Glu Asp Arg Val
130 135 140

Val Ser Val Ser Phe Arg Val Leu Tyr Pro Ile Val Ile Thr Ser Leu
145 150 155 160

Gly Val Phe Tyr Asp Ala Asn Asp Val Gly Phe Gln Arg Asn Ile Thr
165 170 175

Val Lys Leu Tyr Gln Ala Glu Gln Glu Glu Ala Leu Phe Ile Ala Arg
180 185 190

Phe Ser Pro Pro Ser Cys Gly Val Gln Val Asn Lys Leu Trp Tyr Lys
195 200 205

Pro Val Glu Gln Phe Ile Leu Pro Glu Ser Phe Glu Gly Thr Ile Val
210 215 220

Trp Glu Ser Gln Asp Leu His Gly Leu Val Ser Arg Asn Leu His Lys
225 230 235 240

Val Thr Val Asn Asp Gly Gly Gly Val Leu Arg Val Ile Thr Ala Gly
245 250 255

Glu Gly Ala Leu Pro His Glu Phe Leu Glu Gly Val Glu Gly Val Ala
260 265 270

Gly Gly Phe Ile Tyr Thr Ile Gln Glu Gly Asp Ala Leu Leu His Asn
275 280 285

Leu His Ser Arg Pro Gln Arg Leu Ile Asp His Ile Arg Asn Leu His
290 295 300

Glu Glu Asp Ala Leu Leu Lys Glu Glu Ser Ser Ile Tyr Asp Asp Ile
305 310 315 320

Val Phe Val Asp Val Val Asp Thr Tyr Arg Asn Val Pro Ala Lys Leu
325 330 335

Leu Asn Phe Tyr Arg Trp Thr Val Glu Thr Thr Ser Phe Asn Leu Leu
340 345 350

Leu Lys Thr Asp Asp Asp Cys Tyr Ile Asp Leu Glu Ala Val Phe Asn
355 360 365

Arg Ile Val Gln Lys Asn Leu Asp Gly Pro Asn Phe Trp Trp Gly Asn
370 375 380

Phe Arg Leu Asn Trp Ala Val Asp Arg Thr Gly Lys Trp Gln Glu Leu
385 390 395 400

Glu Tyr Pro Ser Pro Ala Tyr Pro Ala Phe Ala Cys Gly Ser Gly Tyr
405 410 415

Val Ile Ser Lys Asp Ile Val Lys Trp Leu Ala Ser Asn Ser Gly Arg
420 425 430

Leu Lys Thr Tyr Gln Gly Glu Asp Val Ser Met Gly Ile Trp Met Ala
435 440 445

Ala Ile Gly Pro Lys Arg Tyr Gln Asp Ser Leu Trp Leu Cys Glu Lys
450 455 460

Thr Cys Glu Thr Gly Met Leu Ser Ser Pro Gln Tyr Ser Pro Trp Glu
465 470 475 480

Leu Thr Glu Leu Trp Lys Leu Lys Glu Arg Cys Gly Asp Pro Cys Arg
485 490 495

Cys Gln Ala Arg
500

<210> 3
<211> 1515
<212> DNA
<213> Mouse

<400> 3 

atgcgaaact ggctggtgct gctgtgccct tgcgtgctcg gggccgcgct gcacctctgg 60
cacctctggc tccgttcccc gccggacccc cacaacaccg ggcccagcgc ggcagatcaa 120
tcagccttat ttcctcactg gaaatttagc cactatgatg tggtagttgg tgtgttatca 180
gctcgaaata accacgaact tcgaaatgtg ataaggaaca cctggctgaa gaatttgctg 240
catcatccta cattaagtca acgtgtgctt gtgaagttca taataggtgc ccgtggctgt 300
gaagtgcctg tggaagacag ggaggatcct tactcctgcc gactgctcaa catcaccaat 360
ccagttttga atcaagaaat tgaggcattc agctttcctg aagatgcctc ctcatctaga 420
ctctctgaag accgagttgt cagcgtgagc ttcagagttc tctacccaat cgtgattacc 480
agtcttggag tgttctacga tgccagtgat gttggttttc aaaggaacat cacagtcaag 540
ttgtatcaga cagagcagga ggaggccctt ttcatcgccc gattcagtcc tccaagttgt 600
ggcgtacaag tgaacaagct ctggtataag cccgtggaac agttcatctt accagagagc 660
tttgaaggta caatcgtgtg ggaaagccaa gatctccatg gcctcgtgtc cagaaacctg 720
cacagagtga cagtgaatga tggagggggt gttctcagag tccttgcagc tggggaaggg 780
gcactgcctc atgaattcat ggaaggtgtg gagggagttg cgggtggctt tatctacact 840
gttcaggaag gtgatgcact attaagaagc ctttattctc ggccccagag acttgcagat 900
cacatacagg atctgcaggt ggaagatgcc ttactgcagg aggaaagcag tgtccatgac 960
gacattgtct tcgtggatgt tgtggatact taccggaatg ttcctgcaaa attactgaac 1020
ttctatagat ggactgtgga atccaccagc ttcgatttgc tgctcaagac agatgacgac 1080
tgttatatag acttagaagc tgtgtttaat agaattgctc agaagaatct agatgggcct 1140
aatttttggt ggggaaattt caggttgaat tgggcagtgg acagaaccgg aaaatggcag 1200
gagctggaat acccgagccc ggcttaccct gcctttgcat gtgggtcagg gtatgtgatc 1260
tccaaggata tcgttgactg gctggcaggc aactccagaa ggttaaagac ctatcagggt 1320
gaagatgtca gcatgggcat ttggatggca gccataggac ctaaaagaca ccaggacagc 1380
ctgtggctgt gtgagaaaac ctgtgagaca ggaatgctgt cttctcctca gtactcacca 1440
gaagagctga gcaaactctg ggaactgaag gagctgtgtg gggatccttg tcagtgtgaa 1500
gcaaaagtac gatga 1515

<210> 4
<211> 504
<212> PRT
<213> Mouse

<400> 4

Met Arg Asn Trp Leu Val Leu Leu Cys Pro Cys Val Leu Gly Ala Ala
1 5 10 15

Leu His Leu Trp His Leu Trp Leu Arg Ser Pro Pro Asp Pro His Asn
20 25 30

Thr Gly Pro Ser Ala Ala Asp Gln Ser Ala Leu Phe Pro His Trp Lys
35 40 45

Phe Ser His Tyr Asp Val Val Val Gly Val Leu Ser Ala Arg Asn Asn
50 55 60

His Glu Leu Arg Asn Val Ile Arg Asn Thr Trp Leu Lys Asn Leu Leu
65 70 75 80

His His Pro Thr Leu Ser Gln Arg Val Leu Val Lys Phe Ile Ile Gly
85 90 95

Ala Arg Gly Cys Glu Val Pro Val Glu Asp Arg Glu Asp Pro Tyr Ser
100 105 110

Cys Arg Leu Leu Asn Ile Thr Asn Pro Val Leu Asn Gln Glu Ile Glu
115 120 125

Ala Phe Ser Phe Pro Glu Asp Ala Ser Ser Ser Arg Leu Ser Glu Asp
130 135 140

Arg Val Val Ser Val Ser Phe Arg Val Leu Tyr Pro Ile Val Ile Thr
145 150 155 160

Ser Leu Gly Val Phe Tyr Asp Ala Ser Asp Val Gly Phe Gln Arg Asn
165 170 175

Ile Thr Val Lys Leu Tyr Gln Thr Glu Gln Glu Glu Ala Leu Phe Ile
180 185 190

Ala Arg Phe Ser Pro Pro Ser Cys Gly Val Gln Val Asn Lys Leu Trp
195 200 205

Tyr Lys Pro Val Glu Gln Phe Ile Leu Pro Glu Ser Phe Glu Gly Thr
210 215 220

Ile Val Trp Glu Ser Gln Asp Leu His Gly Leu Val Ser Arg Asn Leu
225 230 235 240

His Arg Val Thr Val Asn Asp Gly Gly Gly Val Leu Arg Val Leu Ala
245 250 255

Ala Gly Glu Gly Ala Leu Pro His Glu Phe Met Glu Gly Val Glu Gly
260 265 270

Val Ala Gly Gly Phe Ile Tyr Thr Val Gln Glu Gly Asp Ala Leu Leu
275 280 285

Arg Ser Leu Tyr Ser Arg Pro Gln Arg Leu Ala Asp His Ile Gln Asp
290 295 300

Leu Gln Val Glu Asp Ala Leu Leu Gln Glu Glu Ser Ser Val His Asp
305 310 315 320

Asp Ile Val Phe Val Asp Val Val Asp Thr Tyr Arg Asn Val Pro Ala
325 330 335

Lys Leu Leu Asn Phe Tyr Arg Trp Thr Val Glu Ser Thr Ser Phe Asp
340 345 350

Leu Leu Leu Lys Thr Asp Asp Asp Cys Tyr Ile Asp Leu Glu Ala Val
355 360 365

Phe Asn Arg Ile Ala Gln Lys Asn Leu Asp Gly Pro Asn Phe Trp Trp
370 375 380

Gly Asn Phe Arg Leu Asn Trp Ala Val Asp Arg Thr Gly Lys Trp Gln
385 390 395 400

Glu Leu Glu Tyr Pro Ser Pro Ala Tyr Pro Ala Phe Ala Cys Gly Ser
405 410 415

Gly Tyr Val Ile Ser Lys Asp Ile Val Asp Trp Leu Ala Gly Asn Ser
420 425 430

Arg Arg Leu Lys Thr Tyr Gln Gly Glu Asp Val Ser Met Gly Ile Trp
435 440 445

Met Ala Ala Ile Gly Pro Lys Arg His Gln Asp Ser Leu Trp Leu Cys
450 455 460

Glu Lys Thr Cys Glu Thr Gly Met Leu Ser Ser Pro Gln Tyr Ser Pro
465 470 475 480

Glu Glu Leu Ser Lys Leu Trp Glu Leu Lys Glu Leu Cys Gly Asp Pro
485 490 495

Cys Gln Cys Glu Ala Lys Val Arg
500 504



<210> 5
<211> 37
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: 5' primer for PCR

<400> 5
cccaagcttg ggcctgcaga tcagttggcc ttatttc

<210> 6
<211> 42
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: 3' primer for PCR

<400> 6
aacgcggatc cgcgctgtta tcttgcttga catcgacaag ga

<210> 7
<211> 56
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: 5' primer for PCR

<400> 7
ggggacaagt ttgtacaaaa aagcaggctt ccctgcagat cagttggcct tatttc

<210> 8
<211> 58
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: 3' primer for PCR

<400> 8
ggggaccact ttgtacaaga aagctgggtc ctgttatctt gcttgacatc gacaagga

<210> 9
<211> 22
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Igκ signal sequence

<400> 9
Met His Phe Gln Val Gln Ile Phe Ser Phe Leu Leu Ile Ser Ala Ser
1 5 10 15

Val Ile Met Ser Arg Gly
20 22

<210> 10
<211> 8
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: FLAG peptide

<400> 10
Asp Tyr Lys Asp Asp Asp Asp Lys
1 5 8

<210> 11
<211> 94
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: primer OT3

<400> 11
gatcatgcat tttcaagtgc agattttcag cttcctgcta atcagtgcct cagtcataat 60
gtcacgtgga gattacaagg acgacgatga caag 94

<210> 12
<211> 26
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: primer OT20

<400> 12
cgggatccat gcattttcaa gtgcag 26

<210> 13
<211> 25
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: primer OT21

<400> 13
ggaattcttg tcatcgtcgt ccttg

<210> 14
<211> 21
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: 5' primer for PCR

<400> 14
ggagtgttct acgatgccaa t

<210> 15
<211> 20
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: 3' primer for PCR

<400> 15
ctgaagcgag caatgaagag

<210> 16
<211> 32
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: TaqMan Probe

<400> 16
cactgtcaaa ctttatcagg cagaacaaga gg

<210> 17
<211> 37
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: 5' primer for PCR

<400> 17
cccaagcttg ggagcgcggc agatcaatca gccttat

<210> 18
<211> 53
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: 3' primer for PCR

<400> 18
ttttcctttt gcggccgctt ttttcctttc atcgtacttt tgcttcacac tga

<210> 19
<211> 248
<212> PRT
<213> Homo sapiens

<220>
<223> b3Gal-T1

<400> 19
Phe Leu Val Ile Leu Ile Ser Thr Thr His Lys Glu Phe Asp Ala Arg
1 5 10 15

Gln Ala Ile Arg Glu Thr Trp Gly Asp Glu Asn Asn Phe Lys Gly Ile
20 25 30

Lys Ile Ala Thr Leu Phe Leu Leu Gly Lys Asn Ala Asp Pro Val Leu
35 40 45

Asn Gln Met Val Glu Gln Glu Ser Gln Ile Phe His Asp Ile Ile Val
50 55 60

Glu Asp Phe Ile Asp Ser Tyr His Asn Leu Thr Leu Lys Thr Leu Met
65 70 75 80

Gly Met Arg Trp Val Ala Thr Phe Cys Ser Lys Ala Lys Tyr Val Met
85 90 95

Lys Thr Asp Ser Asp Ile Phe Val Asn Met Asp Asn Leu Ile Tyr Lys
100 105 110

Leu Leu Lys Pro Ser Thr Lys Pro Arg Arg Arg Tyr Phe Thr Gly Tyr
115 120 125

Val Ile Asn Gly Gly Pro Ile Arg Asp Val Arg Ser Lys Trp Tyr Met
130 135 140

Pro Arg Asp Leu Tyr Pro Asp Ser Asn Tyr Pro Pro Phe Cys Ser Gly
145 150 155 160

Thr Gly Tyr Ile Phe Ser Ala Asp Val Ala Glu Leu Ile Tyr Lys Thr
165 170 175

Ser Leu His Thr Arg Leu Leu His Leu Glu Asp Val Tyr Val Gly Leu
180 185 190

Ser Leu His Thr Arg Leu Leu His Leu Glu Asp Val Tyr Val Gly Leu
195 200 205

His Trp Lys Met Ala Tyr Ser Leu Cys Arg Tyr Arg Arg Val Ile Thr
210 215 220

Val His Gln Ile Ser Pro Glu Glu Met His Arg Ile Trp Asn Asp Met
225 230 235 240

Ser Ser Lys Lys His Leu Arg Cys
245 248

<210> 20
<211> 271
<212> PRT
<213> Homo sapiens

<220>
<223> b3Gal-T2

<400> 20

Phe Leu Ile Leu Leu Ile Ala Ala Glu Pro Gly Gln Ile Glu Ala Arg
1 5 10 15

Arg Ala Ile Arg Gln Thr Trp Gly Asn Glu Ser Leu Ala Pro Gly Ile
20 25 30

Gln Ile Thr Arg Ile Phe Leu Leu Gly Leu Ser Ile Lys Leu Asn Gly
35 40 45

Tyr Leu Gln Arg Ala Ile Leu Glu Glu Ser Arg Gln Tyr His Asp Ile
50 55 60

Ile Gln Gln Glu Tyr Leu Asp Thr Tyr Tyr Asn Leu Thr Ile Lys Thr
65 70 75 80

Leu Met Gly Met Asn Trp Val Ala Thr Tyr Cys Pro His Ile Pro Tyr
85 90 95

Val Met Lys Thr Asp Ser Asp Met Phe Val Asn Thr Glu Tyr Leu Ile
100 105 110

Asn Lys Leu Leu Lys Pro Asp Leu Pro Pro Arg His Asn Tyr Phe Thr
115 120 125

Gly Tyr Leu Met Arg Gly Tyr Ala Pro Asn Arg Asn Lys Asp Ser Lys
130 135 140

Trp Tyr Met Pro Pro Asp Leu Tyr Pro Ser Glu Arg Tyr Pro Val Phe
145 150 155 160

Cys Ser Gly Thr Gly Tyr Val Phe Ser Gly Asp Leu Ala Glu Lys Ile
165 170 175

Phe Lys Val Ser Leu Gly Ile Arg Arg Leu His Leu Glu Asp Val Tyr
180 185 190

Val Gly Ile Cys Leu Ala Lys Leu Arg Ile Asp Pro Val Pro Pro Pro
195 200 205

Asn Glu Phe Val Phe Asn His Trp Arg Val Ser Tyr Ser Ser Cys Lys
210 215 220

Tyr Ser His Leu Ile Thr Ser His Gln Phe Gln Pro Ser Glu Leu Ile
225 230 235 240

Lys Tyr Trp Asn His Leu Gln Gln Asn Lys His Asn Ala Cys Ala Asn
245 250 255

Ala Ala Lys Glu Lys Ala Gly Arg Tyr Arg His Arg Lys Leu His
260 265 270 271



<210> 21
<211> 253
<212> PRT
<213> Homo sapiens

<220>
<223> b3Gal-T3

<400> 21
Phe Leu Val Ile Leu Val Thr Ser His Pro Ser Asp Val Lys Ala Arg
1 5 10 15

Gln Ala Ile Arg Val Thr Trp Gly Glu Lys Lys Ser Trp Trp Gly Tyr
20 25 30

Glu Val Leu Thr Phe Phe Leu Leu Gly Gln Glu Ala Glu Lys Glu Asp
35 40 45

Lys Met Leu Ala Leu Ser Leu Glu Asp Glu His Leu Leu Tyr Gly Asp
50 55 60

Ile Ile Arg Gln Asp Phe Leu Asp Thr Tyr Asn Asn Leu Thr Leu Lys
65 70 75 80

Thr Ile Met Ala Phe Arg Trp Val Thr Glu Phe Cys Pro Asn Ala Lys
85 90 95

Tyr Val Met Lys Thr Asp Thr Asp Val Phe Ile Asn Thr Gly Asn Leu
100 105 110

Val Lys Tyr Leu Leu Asn Leu Asn His Ser Glu Lys Phe Phe Thr Gly
115 120 125

Tyr Pro Leu Ile Asp Asn Tyr Ser Tyr Arg Gly Phe Tyr Gln Lys Thr
130 135 140

His Ile Ser Tyr Gln Glu Tyr Pro Phe Lys Val Phe Pro Pro Tyr Cys
145 150 155 160

Ser Gly Leu Gly Tyr Ile Met Ser Arg Asp Leu Val Pro Arg Ile Tyr
165 170 175

Glu Met Met Gly His Val Lys Pro Ile Lys Phe Glu Asp Val Tyr Val
180 185 190

Gly Ile Cys Leu Asn Leu Leu Lys Val Asn Ile His Ile Pro Glu Asp
195 200 205

Thr Asn Leu Phe Phe Leu Tyr Arg Ile His Leu Asp Val Cys Gln Leu
210 215 220

Arg Arg Val Ile Ala Ala His Gly Phe Ser Ser Lys Glu Ile Ile Thr
225 230 235 240

Phe Trp Gln Val Met Leu Arg Asn Thr Thr Cys His Tyr
245 250 253

<210> 22
<211> 253
<212> PRT
<213> Homo sapiens

<220>
<223> b3Gal-T5

<400> 22
Phe Leu Val Leu Leu Val Thr Ser Ser His Lys Gln Leu Ala Glu Arg
1 5 10 15

Met Ala Ile Arg Gln Thr Trp Gly Lys Glu Arg Met Val Lys Gly Lys
20 25 30

Gln Leu Lys Thr Phe Phe Leu Leu Gly Thr Thr Ser Ser Ala Ala Glu
35 40 45

Thr Lys Glu Val Asp Gln Glu Ser Gln Arg His Gly Asp Ile Ile Gln
50 55 60

Lys Asp Phe Leu Asp Val Tyr Tyr Asn Leu Thr Leu Lys Thr Met Met
65 70 75 80

Gly Ile Glu Trp Val His Arg Phe Cys Pro Gln Ala Ala Phe Val Met
85 90 95

Lys Thr Asp Ser Asp Met Phe Ile Asn Val Asp Tyr Leu Thr Glu Leu
100 105 110

Leu Leu Lys Lys Asn Arg Thr Thr Arg Phe Phe Thr Gly Phe Leu Lys
115 120 125

Leu Asn Glu Phe Pro Ile Arg Gln Pro Phe Ser Lys Trp Phe Val Ser
130 135 140

Lys Ser Glu Tyr Pro Trp Asp Arg Tyr Pro Pro Phe Cys Ser Gly Thr
145 150 155 160

Gly Tyr Val Phe Ser Gly Asp Val Ala Ser Gln Val Tyr Asn Val Ser
165 170 175

Lys Ser Val Pro Tyr Ile Lys Leu Glu Asp Val Phe Val Gly Leu Cys
180 185 190

Leu Glu Arg Leu Asn Ile Arg Leu Glu Glu Leu His Ser Gln Pro Thr
195 200 205

Phe Phe Pro Gly Gly Leu Arg Phe Ser Val Cys Leu Phe Arg Arg Ile
210 215 220

Val Ala Cys His Phe Ile Lys Pro Arg Thr Leu Leu Asp Tyr Trp Gln
225 230 235 240

Ala Leu Glu Asn Ser Arg Gly Glu Asp Cys Pro Pro Val
245 250 253

<2l0> 23 

<2ll> 272 

<2l2> PRT 

<2l3> Homo sapiens 

<220> 

<223> b3Gal-T6 
<400> 23 

Phe Leu Ala Val Leu Val Ala Ser Ala Pro Arg Ala Ala Glu Arg Arg 

1 5 10 15 

Ser Val He Arg Ser Thr Trp Leu Ala Arg Arg Gly Ala Pro Gly Asp 

20 25 30 

Val Trp Ala Arg Phe Ala Val Gly Thr Ala Gly Leu Gly Ala Glu Glu 

35 40 45 

Arg Arg Ala Leu Glu Arg Glu Gin Ala Arg His Gly Asp Leu Leu Leu 

50 55 60 

Leu Pro Ala Leu Arg Asp Ala Tyr Glu Asn Leu Thr Ala Lys Val Leu 
65 70 75 80 

Ala Met Leu Ala Trp Leu Asp Glu His Val Ala Phe Glu Phe Val Leu 

85 90 95 

Lys Ala Asp Asp Asp Ser Phe Ala Arg Leu Asp Ala Leu Leu Ala Glu 
100 105 110 
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Leu Arg Ala Arg Glu Pro Ala Arg Arg Arg Arg Leu Tyr Trp Gly Phe 

1 1 5 120 125 

Phe Ser Gly Arg Gly Arg Val Lys Pro Gly Gly Arg Trp Arg Glu Ala 

130 135 140 

Ala Trp Gin Leu Cys Asp Tyr Tyr Leu Pro Tyr Ala Leu Gly Gly Gly 
145 150 155 160 

Tyr Val Leu Ser Ala Asp Leu Val His Tyr Leu Arg Leu Ser Arg Asp 

165 170 175 

Tyr Leu Arg Ala Trp His Ser Glu Asp Val Ser Leu Gly Ala Trp Leu 

180 185 190 

Ala Pro Val Asp Val Gin Arg Glu His Asp Pro Arg Phe Asp Thr Glu 

195 200 205 

Tyr Arg Ser Arg Gly Cys Ser Asn Gin Tyr Leu Val Thr His Lys Gin 

210 215 220 

Ser Leu Glu Asp Met Leu Glu Lys His Ala Thr Leu Ala Arg Glu Gly 
225 230 235 240 

Arg Leu Cys Lys Arg Glu Val Gin Leu Arg Leu Ser Tyr Val Tyr Asp 

245 250 255 

Trp Ser Ala Pro Pro Ser Gin Cys Cys Gin Arg Arg Glu Gly lie Pro 
260 265 270 272 

<210> 24 
<211> 255 
<212> PRT 

<213> Homo sapiens 
<220> 

<223> b3GnT2 
<400> 24

Phe Leu Leu Leu Ala Ile Lys Ser Leu Thr Pro His Phe Ala Arg Arg
1 5 10 15

Gln Ala Ile Arg Glu Ser Trp Gly Gln Glu Ser Asn Ala Gly Asn Gln
20 25 30

Thr Val Val Arg Val Phe Leu Leu Gly Gln Thr Pro Pro Glu Asp Asn
35 40 45

His Pro Asp Leu Ser Asp Met Leu Lys Phe Glu Ser Glu Lys His Gln
50 55 60

Asp Ile Leu Met Trp Asn Tyr Arg Asp Thr Phe Phe Asn Leu Ser Leu
65 70 75 80

Lys Glu Val Leu Phe Leu Arg Trp Val Ser Thr Ser Cys Pro Asp Thr
85 90 95

Glu Phe Val Phe Lys Gly Asp Asp Asp Val Phe Val Asn Thr His His
100 105 110

Ile Leu Asn Tyr Leu Asn Ser Leu Ser Lys Thr Lys Ala Lys Asp Leu
115 120 125

Phe Ile Gly Asp Val Ile His Asn Ala Gly Pro His Arg Asp Lys Lys
130 135 140

Leu Lys Tyr Tyr Ile Pro Glu Val Val Tyr Ser Gly Leu Tyr Pro Pro
145 150 155 160

Tyr Ala Gly Gly Gly Gly Phe Leu Tyr Ser Gly His Leu Ala Leu Arg
165 170 175

Leu Tyr His Ile Thr Asp Gln Val His Leu Tyr Pro Ile Asp Asp Val
180 185 190

Tyr Thr Gly Met Cys Leu Gln Lys Leu Gly Leu Val Pro Glu Lys His
195 200 205

Lys Gly Phe Arg Thr Phe Asp Ile Glu Glu Lys Asn Lys Asn Asn Ile
210 215 220

Cys Ser Tyr Val Asp Leu Met Leu Val His Ser Arg Lys Pro Gln Glu
225 230 235 240

Met Ile Asp Ile Trp Ser Gln Leu Gln Ser Ala His Leu Lys Cys
245 250 255

<210> 25
<211> 265
<212> PRT
<213> Homo sapiens
<220>
<223> b3GnT3
<400> 25

Phe Leu Leu Leu Val Ile Lys Ser Ser Pro Ser Asn Tyr Val Arg Arg
1 5 10 15

Glu Leu Leu Arg Arg Thr Trp Gly Arg Glu Arg Lys Val Arg Gly Leu
20 25 30

Gln Leu Arg Leu Leu Phe Leu Val Gly Thr Ala Ser Asn Pro His Glu
35 40 45

Ala Arg Lys Val Asn Arg Leu Leu Glu Leu Glu Ala Gln Thr His Gly
50 55 60

Asp Ile Leu Gln Trp Asp Phe His Asp Ser Phe Phe Asn Leu Thr Leu
65 70 75 80

Lys Gln Val Leu Phe Leu Gln Trp Gln Glu Thr Arg Cys Ala Asn Ala
85 90 95

Ser Phe Val Leu Asn Gly Asp Asp Asp Val Phe Ala His Thr Asp Asn
100 105 110

Met Val Phe Tyr Leu Gln Asp His Asp Pro Gly Arg His Leu Phe Val
115 120 125

Gly Gln Leu Ile Gln Asn Val Gly Pro Ile Arg Ala Phe Trp Ser Lys
130 135 140

Tyr Tyr Val Pro Glu Val Val Thr Gln Asn Glu Arg Tyr Pro Pro Tyr
145 150 155 160

Cys Gly Gly Gly Gly Phe Leu Leu Ser Arg Phe Thr Ala Ala Ala Leu
165 170 175

Arg Arg Ala Ala His Val Leu Asp Ile Phe Pro Ile Asp Asp Val Phe
180 185 190

Leu Gly Met Cys Leu Glu Leu Glu Gly Leu Lys Pro Ala Ser His Ser
195 200 205

Gly Ile Arg Thr Ser Gly Val Arg Ala Pro Ser Gln His Leu Ser Ser
210 215 220

Phe Asp Pro Cys Phe Tyr Arg Asp Leu Leu Leu Val His Arg Phe Leu
225 230 235 240

Pro Tyr Glu Met Leu Leu Met Trp Asp Ala Leu Asn Gln Pro Asn Leu
245 250 255

Thr Cys Gly Asn Gln Thr Gln Ile Tyr
260 265

<210> 26
<211> 260
<212> PRT
<213> Homo sapiens
<220>
<223> b3GnT4
<400> 26

Phe Leu Leu Leu Ala Ile Lys Ser Gln Pro Gly His Val Glu Arg Arg
1 5 10 15

Ala Ala Ile Arg Ser Thr Trp Gly Arg Val Gly Gly Trp Ala Arg Gly
20 25 30

Arg Gln Leu Lys Leu Val Phe Leu Leu Gly Val Ala Gly Ser Ala Pro
35 40 45

Pro Ala Gln Leu Leu Ala Tyr Glu Ser Arg Glu Phe Asp Asp Ile Leu
50 55 60

Gln Trp Asp Phe Thr Glu Asp Phe Phe Asn Leu Thr Leu Lys Glu Leu
65 70 75 80

His Leu Gln Arg Trp Val Val Ala Ala Cys Pro Gln Ala His Phe Met
85 90 95

Leu Lys Gly Asp Asp Asp Val Phe Val His Val Pro Asn Val Leu Glu
100 105 110

Phe Leu Asp Gly Trp Asp Pro Ala Gln Asp Leu Leu Val Gly Asp Val
115 120 125

Ile Arg Gln Ala Leu Pro Asn Arg Asn Thr Lys Val Lys Tyr Phe Ile
130 135 140

Pro Pro Ser Met Tyr Arg Ala Thr His Tyr Pro Pro Tyr Ala Gly Gly
145 150 155 160

Gly Gly Tyr Val Met Ser Arg Ala Thr Val Arg Arg Leu Gln Ala Ile
165 170 175

Met Glu Asp Ala Glu Leu Phe Pro Ile Asp Asp Val Phe Val Gly Met
180 185 190

Cys Leu Arg Arg Leu Gly Leu Ser Pro Met His His Ala Gly Phe Lys
195 200 205

Thr Phe Gly Ile Arg Arg Pro Leu Asp Pro Leu Asp Pro Cys Leu Tyr
210 215 220

Arg Gly Leu Leu Leu Val His Arg Leu Ser Pro Leu Glu Met Trp Thr
225 230 235 240

Met Trp Ala Leu Val Thr Asp Glu Gly Leu Lys Cys Ala Ala Gly Pro
245 250 255

Ile Pro Gln Arg
260

<210> 27
<211> 290
<212> PRT
<213> Homo sapiens
<220>
<223> b3GnT5
<400> 27

Leu Leu Leu Leu Phe Val Lys Thr Ala Pro Glu Asn Tyr Asp Arg Arg
1 5 10 15

Ser Gly Ile Arg Arg Thr Trp Gly Asn Glu Asn Tyr Val Arg Ser Gln
20 25 30

Leu Asn Ala Asn Ile Lys Thr Leu Phe Ala Leu Gly Thr Pro Asn Pro
35 40 45

Leu Glu Gly Glu Glu Leu Gln Arg Lys Leu Ala Trp Glu Asp Gln Arg
50 55 60

Tyr Asn Asp Ile Ile Gln Gln Asp Phe Val Asp Ser Phe Tyr Asn Leu
65 70 75 80

Thr Leu Lys Leu Leu Met Gln Phe Ser Trp Ala Asn Thr Tyr Cys Pro
85 90 95

His Ala Lys Phe Leu Met Thr Ala Asp Asp Asp Ile Phe Ile His Met
100 105 110

Pro Asn Leu Ile Glu Tyr Leu Gln Ser Leu Glu Gln Ile Gly Val Gln
115 120 125

Asp Phe Trp Ile Gly Arg Val His Arg Gly Ala Pro Pro Ile Arg Asp
130 135 140

Lys Ser Ser Lys Tyr Tyr Val Ser Tyr Glu Met Tyr Gln Trp Pro Ala
145 150 155 160

Tyr Pro Asp Tyr Thr Ala Gly Ala Ala Tyr Val Ile Ser Gly Asp Val
165 170 175

Ala Ala Lys Val Tyr Glu Ala Ser Gln Thr Leu Asn Ser Ser Leu Tyr
180 185 190

Ile Asp Asp Val Phe Met Gly Leu Cys Ala Asn Lys Ile Gly Ile Val
195 200 205

Pro Gln Asp His Val Phe Phe Ser Gly Glu Gly Lys Thr Pro Tyr His
210 215 220

Pro Cys Ile Tyr Glu Lys Met Met Thr Ser His Gly His Leu Glu Asp
225 230 235 240

Leu Gln Asp Leu Trp Lys Asn Ala Thr Asp Pro Lys Val Lys Thr Ile
245 250 255

Ser Lys Gly Phe Phe Gly Gln Ile Tyr Cys Arg Leu Met Lys Ile Ile
260 265 270

Leu Leu Cys Lys Ile Ser Tyr Val Asp Thr Tyr Pro Cys Arg Ala Ala
275 280 285

Phe Ile
290
Claims



1. A β1,3-N-acetyl-D-galactosamine transferase protein which transfers N-acetyl-D-galactosamine to N-acetyl-D-glucosamine with β1,3 linkage.

2. The glycosyltransferase protein according to claim 1, which has at least one of the following properties (a) to (c):

(a) acceptor substrate specificity
when using an oligosaccharide as an acceptor substrate, the protein shows transferase activity toward Bz-β-GlcNAc, GlcNAc-β1-4-GlcNAc-β-Bz, Gal-β1-3-(GlcNAc-β1-6)GalNAc-α-pNp, GlcNAc-β1-3-GalNAc-α-pNp and GlcNAc-β1-6-GalNAc-α-pNp ("GlcNAc" represents an N-acetyl-D-glucosamine residue, "GalNAc" represents an N-acetyl-D-galactosamine residue, "Bz" represents a benzyl group, "pNp" represents a p-nitrophenyl group, and "-" represents a glycosidic linkage. Numbers in these formulae each represent the carbon number in the sugar ring where a glycosidic linkage is present, and "α" and "β" represent anomers of the glycosidic linkage at the 1-position of the sugar ring. An anomer whose positional relationship with CH₂OH or CH₃ at the 5-position is trans and cis is represented by "α" and "β", respectively);
(b) reaction pH
the activity is lower in a pH range of 6.2 to 6.6 than in other pH ranges; or
(c) divalent ion requirement
although the activity is enhanced at least in the presence of Mn²⁺, Co²⁺ or Mg²⁺, the Mn²⁺-induced enhancement of the activity is almost completely eliminated in the presence of Cu⁺.



3. A glycosyltransferase protein which comprises the following polypeptide (A) or (B):

(A) a polypeptide which has the amino acid sequence shown in SEQ ID NO: 2 or 4; or
(B) a polypeptide which has an amino acid sequence with substitution, deletion or insertion of one or more amino acids in the amino acid sequence shown in SEQ ID NO: 2 or 4 and which transfers N-acetyl-D-galactosamine to N-acetyl-D-glucosamine with β1,3 linkage.

4. The glycosyltransferase protein according to claim 3, wherein the polypeptide (A) consists of a polypeptide having 
an amino acid sequence covering amino acids 189 to 500 shown in SEQ ID NO: 2. 


5. The glycosyltransferase protein according to claim 3, wherein the polypeptide (A) consists of a polypeptide having 
an amino acid sequence covering amino acids 36 to 500 shown in SEQ ID NO: 2. 

6. The glycosyltransferase protein according to claim 3, which consists of a polypeptide having an amino acid sequence sharing at least more than 30% identity with an amino acid sequence covering amino acids 189 to 500 shown in SEQ ID NO: 2 or amino acids 35 to 504 shown in SEQ ID NO: 4.

7. A nucleic acid consisting of a nucleotide sequence encoding the polypeptide according to any one of claims 3 to 
6 or a nucleotide sequence complementary thereto. 


8. The nucleic acid according to claim 7, which consists of the nucleotide sequence shown in SEQ ID NO: 1 or 3 or 
a nucleotide sequence complementary to at least one of them. 

9. The nucleic acid according to claim 7, which consists of a nucleotide sequence covering nucleotides 565 to 1503 shown in SEQ ID NO: 1 or a nucleotide sequence complementary thereto.

10. The nucleic acid according to claim 7, which consists of a nucleotide sequence covering nucleotides 106 to 1503 shown in SEQ ID NO: 1 or a nucleotide sequence complementary thereto.

11. The nucleic acid according to claim 7, which consists of a nucleotide sequence covering nucleotides 103 to 1512 shown in SEQ ID NO: 3 or a nucleotide sequence complementary thereto.

12. The nucleic acid according to any one of claims 7 to 11, which is DNA.

13. A vector carrying the nucleic acid according to any one of claims 7 to 12.

14. A transformant containing the vector according to claim 13.

15. A method for producing a β1,3-N-acetyl-D-galactosamine transferase protein, which comprises growing the transformant according to claim 14 to express the glycosyltransferase protein and collecting the glycosyltransferase protein from the transformant.

16. An antibody recognizing the β1,3-N-acetyl-D-galactosamine transferase protein according to any one of claims 1 to 6.
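
As an illustrative cross-check of the coordinate ranges recited in claims 4 to 6 and 9 to 11, the following minimal Python sketch maps a 1-based, inclusive nucleotide range onto the codons it spans. It assumes, purely for illustration, that the reading frame of each SEQ ID NO begins at nucleotide 1; the helper name `codon_span` is hypothetical and does not appear in the application.

```python
from typing import Tuple


def codon_span(nt_start: int, nt_end: int) -> Tuple[int, int]:
    """Return the (first, last) codon numbers covered by a nucleotide range.

    Coordinates are 1-based and inclusive. ASSUMPTION: codon 1 begins at
    nucleotide 1 of the sequence, so codon k occupies nucleotides 3k-2 to 3k.
    """
    if nt_start < 1 or nt_end < nt_start:
        raise ValueError("expected 1 <= nt_start <= nt_end")
    if (nt_start - 1) % 3 != 0 or nt_end % 3 != 0:
        raise ValueError("range is not aligned to whole codons")
    first_codon = (nt_start - 1) // 3 + 1
    last_codon = nt_end // 3
    return first_codon, last_codon


# Nucleotide ranges as recited in claims 9, 10 and 11.
claimed_ranges = {
    "claim 9 (SEQ ID NO: 1)": (565, 1503),
    "claim 10 (SEQ ID NO: 1)": (106, 1503),
    "claim 11 (SEQ ID NO: 3)": (103, 1512),
}

for label, (start, end) in claimed_ranges.items():
    first, last = codon_span(start, end)
    print(f"{label}: nucleotides {start}-{end} -> codons {first}-{last}")
```

Under the stated assumption, the three ranges begin at codons 189, 36 and 35, which agrees with the first residues recited in claims 4, 5 and 6. The SEQ ID NO: 1 ranges end at codon 501, one past residue 500; this would be consistent with the range including a terminal stop codon, although this section of the document does not confirm that.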
Figure 1

Figure 2A

Figure 2A (continued)

G34, noesyprtp, 0.9 s, 298 K, 03-01-09

Figure 2B (continued)

Figure 3

Table 1

1H Chemical shift

Table 2

Coupling coefficient

Figure 4

Table 3

Figure 5

Figure 7

Figure 9 (continued)

Figure 10